Zhi Zhenfeng: Information Content Governance for Generative Artificial Intelligence Large Models

Author: Zhi Zhenfeng, Researcher, Institute of Law, Chinese Academy of Social Sciences


Abstract: Built on large computing power and strong algorithms that process massive big data, generative artificial intelligence large models perform outstandingly in natural language processing, computer vision, speech processing and other fields. They can already provide services such as creative content generation, digital humans, conversational search and code generation, and they have rich application prospects in autonomous driving, financial risk control, healthcare and the Internet of Things. As a major change in Internet information technology, large models' logical reasoning and their ability to "understand" human beings have been greatly enhanced. They not only become powerful tools for humans to produce creative information content, but may also profoundly change the online information content ecosystem, bringing a flood of poor-quality information, pollution of initial knowledge sources, and impacts on social ethics. It is necessary to balance development and safety and to explore an incentive-compatible path of governance.


Keywords: generative artificial intelligence; large language models; information content; incentive compatibility; governance

Enhancing computers' integration of human knowledge and understanding of human intent, expanding the boundaries of human intelligence, and achieving smoother human-computer interaction have always been important directions of information technology. Since ChatGPT, launched by the US artificial intelligence research company OpenAI, became an overnight sensation, the tech giants have steadily escalated the generative AI race. Google followed its chatbot Bard with PaLM 2 (Pathways Language Model 2), a multimodal model that can "understand" and generate audio and video content; Microsoft integrated the multimodal Generative Pre-trained Transformer 4 (hereinafter GPT-4) into its New Bing search engine; Amazon announced it had joined the contest by releasing Titan. Chinese large models keep emerging as well: Baidu's "Wenxin Yiyan" (ERNIE Bot), Huawei's "Pangu", Tencent's "Hunyuan", Alibaba's "Tongyi Qianwen", SenseTime's "SenseNova", Kunlun Wanwei's "Tiangong" and iFlytek's "Xinghuo" (Spark). Generative AI Large Language Models (hereinafter LLMs) of every kind have erupted, and the boom in applying the technology has swept the world.


Based on large computing power and strong algorithms that process massive big data, large models are trained on large-scale unlabelled data to learn features or patterns and predict future results. Parameter counts have risen from hundreds of millions to hundreds of billions, and the models have leapt from supporting a single task in a single modality (picture, image, text or speech) to supporting multiple tasks across multiple modalities, becoming model libraries with generalisation capability. Large models "work miracles through brute force": they perform outstandingly in natural language processing, computer vision, speech processing and other fields; they can already provide services such as content creation, digital humans, conversational search and code generation; and they hold great promise in autonomous driving, financial risk control, healthcare and the Internet of Things.
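The "training on large-scale unlabelled data to learn features or patterns and predict future results" described above is, concretely, self-supervised next-token prediction. The sketch below illustrates that objective in PyTorch; all sizes are toy values, and the embedding layer merely stands in for a full transformer stack.

```python
# Minimal sketch of self-supervised next-token prediction (illustrative sizes,
# not those of any production model; the embedding stands in for a transformer).
import torch
import torch.nn as nn

vocab_size, d_model = 1000, 64
embed = nn.Embedding(vocab_size, d_model)
lm_head = nn.Linear(d_model, vocab_size)

# Unlabelled text supervises itself: each position predicts the next token.
tokens = torch.randint(0, vocab_size, (8, 32))   # toy batch of token ids
hidden = embed(tokens)                           # stand-in for the model body
logits = lm_head(hidden)                         # scores over the vocabulary

loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),      # predictions at positions 0..n-2
    tokens[:, 1:].reshape(-1),                   # targets: the same text shifted by one
)
loss.backward()                                  # gradients for the optimiser step
```

Scaling exactly this objective from millions to hundreds of billions of parameters is what produced the qualitative leaps the text describes.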

Large models can already serve "thousands of industries". As a major change in Internet information technology, their logical reasoning and ability to "understand" humans have been greatly improved, bringing revolutionary change to the generation of text, images, voice, video and other information content and truly ushering information content production and dissemination into the era of Artificial Intelligence Generated Content (hereinafter AIGC), triggering a knowledge revolution in human society. By learning the features of objects from huge volumes of data rather than simply comparing and matching, by attempting to understand human intent, and by generating content from existing text, image or audio files on the basis of large datasets, AIGC will not only become a powerful tool for humans to produce creative information content, but may also greatly change the online information content ecosystem and bring new risks and challenges to information content governance.

Because the technical sophistication of generative AI large models in information content production and dissemination lies far from most people's established common sense, the first part of this article surveys, as briefly and plainly as possible, the typical functions, application scenarios and salient features of AI large models in content generation. On that basis, the second part argues that the models' significant impact, together with their own insurmountable limitations, may create major risks for information content governance. Finally, after a brief overview of information content governance at home and abroad, the article attempts to propose a governance path for generative AI information content.

1. Generative artificial intelligence opens a new era of information content production and dissemination


Language has special significance for human beings. Heidegger proposed that "language is the house of being"; Wittgenstein stated bluntly that "the limits of my language mean the limits of my world". In the development of artificial intelligence technology, natural language processing has been regarded as "the jewel in the crown of artificial intelligence": enabling computers to understand and process human language is a key problem of human-computer interaction. The natural language processing framework adopted by generative AI large language models has made significant advances in human-computer dialogue and content generation, learning from and training on large text datasets to produce complex, intelligent writing that can even be transformed into images or videos.

1.1 Revolutionary changes in the production and dissemination of information content


The history of humankind is a history of information production, exchange and dissemination. From the oral transmission of primitive society, through the rudimentary paper and silk of agricultural society, to the radio and television of the industrial era, and then to the Internet and especially mobile communication technology, the production and dissemination of human information content has developed mainly along two modes: user-generated content (UGC) and professionally generated content (PGC). Before the Internet era, whether on silk, in books and newspapers, or over radio and television, the content that spread most widely and circulated longest was mainly professionally generated, and its producers were chiefly intellectuals, officials and professionals in particular fields. In the age of mass media there were also journalists, editors and other content producers acting as gatekeepers. On the whole, professionally generated content was more authoritative, more reliable and of better quality. By comparison, word of mouth and street gossip were mainly user-produced: their producers were often not professionals, there were generally no quality gatekeepers, such "gossip" was mostly "self-produced and self-consumed", and it rose suddenly and died quickly. In the Internet era, however, especially after the spread of social media, everyone has a microphone and everyone has a camera: the "street talk" of cyberspace can also be widely disseminated and durably recorded, and short videos give everyone the chance to "be seen". In cyberspace, user-generated content naturally has an overwhelming quantitative advantage. On WeChat alone, hundreds of millions of audio and video calls are made and tens of billions of messages are sent every day. By the end of 2022, China's online video (including short video) users had reached 1.031 billion, and livestreaming users 751 million. The production and dissemination of information content in human society has thus undergone a revolutionary shift from mainly professional production to mainly user production.

The emergence of generative AI large language models opens a new era of AI-generated content, another revolutionary change in how human information content is produced and disseminated. The main body of content production mutates: artificial intelligence can replace human beings across the whole process of information collection, screening, integration and reasoning, greatly freeing human resources. The efficiency of content production changes disruptively: strong algorithms driven by large computing power can process big data to make high-quality judgements and generate content efficiently across a vast range of tasks, including natural language processing (text classification, sentiment analysis, machine translation, question-answering systems, text generation), computer vision (image classification, object detection, image segmentation, face recognition, image generation), autonomous driving (vehicle control, road identification, traffic flow prediction), financial risk control (fraud identification, risk assessment, market-change prediction), healthcare (disease diagnosis, pathology analysis, medical image analysis) and the Internet of Things (smart home, smart manufacturing, environmental monitoring). The dissemination of content changes disruptively as well: producing and spreading information has become more convenient, especially by lowering the threshold for acquiring expertise. And the expression of content has become richer: AI generation technology allows free conversion among graphics, text and code and can produce a "digital human" at a single click, "opening the era of intelligent interconnection".

1.2 Content Generation Functions of Large Models


Large models already have multimodal and cross-modal content production capabilities. Among the large models released at home and abroad, information content generation mainly takes natural language processing as the core architecture, uses the Transformer as a common module and interface, and relies on deep learning models with a self-attention mechanism to generate text, images and other content similar to what humans create. GPT-4, for example, is pre-trained on a multimodal corpus of diverse data, including text and arbitrarily interleaved images and text, which gives the model native support for multimodal tasks.
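The self-attention mechanism mentioned above can be stated compactly. Below is a single-head, scaled dot-product toy version in PyTorch; production models stack many multi-head layers, so this is only a sketch of the core idea.

```python
# Single-head scaled dot-product self-attention, the core of the Transformer
# module referred to above (toy sketch; real models use many heads and layers).
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    q, k, v = x @ w_q, x @ w_k, x @ w_v                 # queries, keys, values
    scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))
    weights = scores.softmax(dim=-1)                    # each token attends to all tokens
    return weights @ v                                  # context-mixed representations

d = 16
x = torch.randn(10, d)                                  # 10 tokens, d-dim embeddings
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))   # toy projection matrices
out = self_attention(x, w_q, w_k, w_v)                  # shape (10, 16)
```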


Based on Reinforcement Learning from Human Feedback (RLHF), large language models such as ChatGPT can learn from user inputs and improve their outputs, and can "align" the model's expressions and internal values with human common sense and values. ChatGPT also uses instruction tuning to better adapt to users' language habits and communication styles and to understand their questions, improving the system's adaptability and performance for specific tasks and scenarios.
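Schematically, RLHF trains a reward model on human rankings of candidate answers and then nudges the generator towards answers the reward model scores highly. The sketch below shows only that control flow under stated assumptions: both functions are placeholders rather than a real pipeline, and real systems use policy-gradient updates (e.g. PPO) rather than simple reranking.

```python
# Schematic of the RLHF loop described above. Both functions are placeholders:
# a real pipeline trains the reward model on human preference pairs and updates
# the policy with PPO-style gradients instead of merely reranking.
def reward_model(prompt: str, answer: str) -> float:
    """Stand-in for a model trained on human rankings of answer pairs."""
    return float(len(answer))            # placeholder score, not a real metric

def generate_candidates(prompt: str, n: int = 4) -> list[str]:
    """Stand-in for sampling n answers from the current policy model."""
    return [f"candidate {i} for: {prompt}" for i in range(n)]

prompt = "Explain what RLHF is."
candidates = generate_candidates(prompt)
ranked = sorted(candidates, key=lambda a: reward_model(prompt, a), reverse=True)
best = ranked[0]   # the policy is pushed towards answers like this one
```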


In terms of output form, generative AI models can already produce multiple modalities: text, images, video, audio, digital humans and 3D content. Take SenseTime's "SenseNova" model series as an example. "SenseMirage" is a text-to-image creation platform that generates pictures with realistic light and shadow, rich detail and varied styles, supporting output up to 6K resolution. "SenseChat" is an efficient chat assistant that can solve complex problems in seconds, provide customised suggestions and assist in drafting high-quality text, and it learns and evolves continuously. "MingMou" is a data annotation platform with more than ten built-in general and industry-specific models, supporting intelligent annotation for 2D classification, detection and 3D detection in scenarios such as smart driving, smart traffic and smart cities. "SenseAvatar" is an AI digital human video generation platform that, from only a five-minute clip of real footage, can generate a digital human double with natural voice and movement, accurate lip-sync and proficiency in multiple languages. The scene generation platform "Qiongyu" and the object generation platform "Geji" are 3D content platforms that efficiently and cost-effectively generate large-scale 3D scenes and finely detailed objects, opening new imaginative space for the metaverse and applications that fuse the virtual and the real.

Generative AI models are ushering in the era of Model as a Service (MaaS). Technology giants build general-purpose models and supply them to B-side customers in niche areas (Tencent's Hunyuan also targets government customers), who polish the models and thereby bring them to bear across industries. At the same time, public-beta or paid interfaces are opened to C-side users, attracting players with deeper industry knowledge to refine and train the models.


Generative AI interacts deeply with users and generates massive amounts of information, assisting users in information search, product consumption, participation in public life and more. GPT-3 has 175 billion parameters and was trained on roughly 500 billion tokens of text collected from the web; massive data and powerful computation produced this highly capable, publicly available AI. GPT-4 offers yet another performance improvement over other generative large language models and is an important step towards artificial general intelligence (AGI). Its general-purpose capabilities span scenarios and domains of every kind, including abstraction, comprehension, vision, coding, mathematics, medicine, law and the understanding of human motivation and emotion, and in some domains its task performance reaches or exceeds human level.

1.3 Application Scenarios of Artificial Intelligence Large Models


Generative artificial intelligence can become a chat partner: pre-training enables the underlying model to produce fluent, contextual chat content with a degree of common sense, and its dialogue shows a certain "personality" rather than stiff machine phrasing, so it has the potential to become a virtual companion robot. In specific areas, by learning professional knowledge and applying "fine-tuning" techniques, a large model can take on the "work" of intelligent customer service. In search services, large models will better understand human intent and generate the "answer" the user wants, rather than merely providing a list of web links.

The most typical application of large models is writing. Given a topic and keyword requirements, generative artificial intelligence can "write" stories, novels, poems, letters, news reports, current affairs commentary, thesis outlines and more; it can also revise and polish text, for example through grammatical correction, translation and keyword extraction. Large models can also write code: according to OpenAI's technical developers, a trained large language model can generate functionally correct code bodies from natural language docstrings. One user had ChatGPT write, in seconds, an apology letter for the company involved in the "ice cream incident" at the 2023 Shanghai Auto Show; it proved quicker and more aptly worded than the company's own PR copy, which netizens witheringly judged "not up to ChatGPT's PR level". GPT-4 can also recognise content in pictures, even understanding images with specific connotations behind them, the so-called "meme images". "SenseMirage" in SenseTime's "SenseNova" series, like Stable Diffusion and Midjourney, can generate highly creative images from text prompts.
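The docstring-to-code workflow mentioned above looks roughly like the sketch below when accessed through a chat-completion API. The model identifier and prompts are illustrative, the exact SDK surface varies across versions, and an API key is assumed to be set in the environment.

```python
# Illustrative docstring-to-code request through a chat-completion API.
# Model name and prompts are examples only; SDK details vary by version.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

docstring = '''def is_palindrome(s: str) -> bool:
    """Return True if s reads the same forwards and backwards, ignoring case."""'''

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model identifier
    messages=[
        {"role": "system", "content": "Complete the Python function body only."},
        {"role": "user", "content": docstring},
    ],
)
print(response.choices[0].message.content)  # the generated function body
```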

Generative AI models have already begun to display "expert" abilities in many fields, offering a degree of basic question answering and analysis in medical consultation, legal services, educational tutoring and other areas of professional knowledge. For example, SenseNova can "talk" about patent law; GPT-4 can answer exam questions and analyse charts and diagrams, and can guide and encourage users step by step to think their way to an answer, much like a real human teacher. ChatGPT can help legal practitioners brainstorm, improve case analysis and document drafting, organise citations, and more.

Generative AI large models can be resourceful personal assistants. In daily life, ChatGPT can help book restaurants, reserve cinema tickets, make travel plans and fetch weather forecasts; it can recommend news, music, films and books according to the user's interests, and customise travel routes, schedules and reminders around the user's hobbies, working hours and location. For example, after connecting to Alibaba's large model Tongyi Qianwen, the application DingTalk can fully assist office work: it can compose poems and stories, write emails and generate marketing plans, and during a DingTalk meeting it can produce a meeting record at any moment, automatically summarise the minutes and generate to-do items.

Generative AI also shows promise in product design, deep synthesis and manufacturing. In scenarios such as logo design, apparel design, Internet content illustration and e-commerce illustration, text-to-image, image-to-text and other graphic creative functions can be put to extensive use. Large models can also generate marketing plans from descriptive text and application applets from functional sketches, and can serve education, smart business, smart city and other lines of business, closing the application loop across multiple fields and industries. In addition, deep synthesis functions, such as AI face swapping driven by text descriptions and one-click generation of a digital human double, are growing ever more powerful in connected smart-life scenarios; combined with 3D printing, generated designs can even be manufactured directly into industrial products.

1.4 Characteristics of AI Information Content Generation


Alongside professionally generated content (PGC), user-generated content (UGC), hybrid production and other modes of Internet content production, the artificial intelligence-generated content (AIGC) mode is having an ever more significant impact, bringing an evolution of the producing subjects and modes of content production, improvements in how content is distributed and interacted with, and gains in the quality and effectiveness of content generation. AI-generated content has several highly significant, indeed revolutionary, features.


Information content acquisition has shifted from display to generation. AI large models can summarise and generalise existing human knowledge well, producing streamlined and efficient output from massive data and greatly improving humans' capacity to produce and acquire information. They can write or draft texts, replacing some human labour. They have changed how knowledge is generated and delivered, greatly lowering the threshold for accessing expertise, so that generating expert knowledge no longer requires decades of professional training. In contrast to earlier AI tools used inside media organisations, this generation of generative AI applications is open to all users, bringing the possibility of self-publication and self-creation, a kind of informational universality, and with it a narrowing of society's knowledge divide.


The provision of information content has shifted from fragmentation to integration. Before AI large models, the information people obtained on the Internet came mainly from scattered web pages, knowledge communities, online encyclopaedias and the like. Generative artificial intelligence, by integrating information and analysing data, has consolidated a massive body of public knowledge and can interact with humans in dialogue, combining the functions of search engines, encyclopaedic knowledge platforms, knowledge communities, open-source software communities and part of social media, and producing streamlined, efficient, inductive output from the massive knowledge it inherits, which greatly improves humans' ability to obtain information. To a certain extent, the large model integrates the searching, finding, synthesis and preliminary output of information, helping to promote the transmission, dissemination and inheritance of knowledge.

Service scenarios have shifted from single domains to generality. Generative AI large language models have better generalisability, accuracy and efficiency: they can learn from large datasets through pre-training or other means and then be fine-tuned to handle complex tasks such as computer vision and natural language processing. Large language models are trained on enormous corpora covering a wide range of subject domains and can mimic human intelligence across a broad span of applications. As the "foundation model at the code layer" in the "model-as-a-service" paradigm, generative AI large language models have the potential to become next-generation infrastructure for downstream scenarios ranging from search engines, content platforms and application software to daily work, scientific research, education and even public services, affecting a wide range of industries. The developers of foundation models are thus the "gatekeepers" of the digital technology market, with strong market power. This is a truly epoch-making product in the history of AI: if AlphaGo marks the point where narrow AI met and exceeded human capabilities in a specialised field, ChatGPT opens the era of general AI, that is, of AI with broad learning capabilities that meets or exceeds the capabilities of ordinary human beings in most fields.


Dialogue has shifted from one-way retrieval to intelligent interaction. How to make computers more than cold machines, how to enhance their understanding of humans, and how to make information easier for humans to access have long been important drivers of information technology. Before generative AI large language models, humans acquired knowledge and information through face-to-face communication, by consulting books and materials, or via Internet search engines. These ways of acquiring information were one-way and dull: beyond communication between people, the relationship between humans and books or computer networks was a cold "subject-object" relationship. Generative AI large language models have dramatically changed the dialogue through which humans acquire knowledge and information. ChatGPT, for example, built on a generative pre-trained model over massive Internet text, can understand and answer questions on a wide range of topics and express itself in natural language with a human-like rather than machine-like discourse. GPT-3 already possessed a remarkable in-context learning ability: it can predict words from context and learn or imitate patterns in the data, outputting contextually appropriate responses by matching key information and imitating patterns. As parameter counts rise and in-context learning strengthens, the model can sustain the continuity of a human-computer dialogue and actively ask the user follow-up questions when it cannot understand an instruction. This overlays a layer of "personalised" communication on humans' acquisition of information through large models, turning computer-based information retrieval from a cold machine operation into intelligent interaction with a "human touch".
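In-context learning, as described above, means the model is not retrained for a task; the desired pattern is demonstrated inside the prompt itself and the model continues it. A minimal sketch, with a hypothetical complete() standing in for any large-language-model call:

```python
# Few-shot in-context learning: the task is demonstrated in the prompt itself,
# with no retraining. A hypothetical complete() would stand in for any LLM call.
def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    lines = [f"Q: {q}\nA: {a}" for q, a in examples]   # demonstrations of the pattern
    lines.append(f"Q: {query}\nA:")                    # the model imitates what it saw
    return "\n\n".join(lines)

examples = [
    ("Translate 'cat' into French.", "chat"),
    ("Translate 'dog' into French.", "chien"),
]
prompt = build_few_shot_prompt(examples, "Translate 'bird' into French.")
print(prompt)   # this string would be sent to complete(prompt) on an LLM endpoint
```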

2. Generative artificial intelligence brings new challenges to information content governance


In a sense, the generative AI large model is becoming an aggregate of human information content production and dissemination. Carriers of information content such as books, newspapers, radio and television; information-provision tools such as news media, search engines, knowledge communities, online encyclopaedias and open-source communities; and specific professional roles such as customer service agents, writers, doctors, teachers and experts are all being absorbed into the generative AI large model. The large model becomes textbook, knowledge source, "master teacher" and "authority", able to "monopolise knowledge", "influence judgement" and "shape cognition". Large language models may penetrate every area of human production and life, but the limitations of the technology itself, together with its abuse, will pose serious challenges to information content governance.

2.1 Technical limitations


There are flaws and limitations in the training data. The astronomical volume of data required to pre-train a large model cannot all be verified for accuracy; if the data are inaccurate or incomplete, the reliability of the results inevitably suffers: "garbage in, garbage out". If the data are biased or contain sensitive information, the generated results may exhibit discrimination and misperception. A 2017 study demonstrated bias and stereotyping in natural language processing data by analysing the Stanford Natural Language Inference (SNLI) corpus. Without Internet access or plug-ins, a large model's knowledge is time-bound: the knowledge of GPT-3.5, for example, is limited to events before 2021, and although Google's Bard claims to search the Internet, a time lag remains. Models also face computing power constraints, insufficient training, and high R&D and operating costs. Large-model training is an aesthetics of brute force, demanding big computing power, big data and big models, and every training run is expensive. Minutes released by SenseTime indicate that, on the cloud computing side, at least 10,000 A100 chips are needed to run ChatGPT, and at present only SenseTime, Baidu, Tencent, ByteDance, Alibaba and High-Flyer (Huanfang) hold reserves above 10,000 in China; the computing power gap is huge and the costs extremely high.


Content generation has an upper limit. A high-probability combination of words is not necessarily true, and genuine creativity is difficult. AI models like ChatGPT can only respond on the basis of the information they were trained on; they cannot access real-time facts or understand context the way humans do. First, AI content generation is still essentially knowledge restructuring rather than knowledge production or reproduction. On the one hand, a gap with human intelligence remains: the ability to understand context is limited and the "human touch" is lacking, so the models excel at short, high-volume output but struggle to produce meaningful, innovative content. The answers a model outputs are produced by its pre-trained neural network, whose parameters are randomly initialised and then optimised by stochastic gradient descent over the training data, so the model may give different or even opposite answers to the same question. Sometimes the answer is "convincing", sometimes "solemn nonsense", and when challenged the model will "improvise" or "deny", essentially because the output is chosen randomly and probabilistically from a set of candidate answers and is unpredictable. On the other hand, the quality of the output depends heavily on the user's skill in asking questions (the prompt), and for information in specialised domains there is a tension in natural language processing between generality and specialisation: it is hard to make results readable without diluting their expertise. Second, there is the pervasive problem of "hallucination", which makes content "look right but be essentially wrong". Compressing information from the training set inevitably introduces bias: the model may generate output that does not match the input and may be incorrect, irrelevant or absurd, and such semantically inflated or off-topic content cannot be entirely avoided. Large-model AI wears the appearance of personhood but still cannot truly possess a personality; in digital systems, AIs are not human, and "hallucinations" and over-"assertive responses" are inevitable. Third, there are cross-linguistic and cross-cultural problems: a multilingual corpus can be collected, yet the model may fail to grasp the connotations behind it. In the GPT-3 training dataset released by OpenAI, English accounts for as much as 92.65% of the corpus, while second-ranked French accounts for only 1.92%. The corpus input largely determines the output: too little Chinese in large-model training will greatly affect not only the quality of generated content but also Chinese civilisation itself, for which the Chinese language is the principal vehicle of meaning.
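The randomness described above comes from how decoders choose each token: they sample from a probability distribution rather than always returning one fixed answer. A toy illustration with made-up scores, using softmax and a temperature parameter:

```python
# Why the same question can yield different answers: generation samples from a
# probability distribution over candidate tokens. Scores below are made up.
import numpy as np

rng = np.random.default_rng()

def sample_token(logits, temperature=1.0):
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                          # softmax over the candidates
    return rng.choice(len(probs), p=probs)        # a probability-weighted random pick

logits = [2.0, 1.5, 0.3]                          # scores for three candidate tokens
print([sample_token(logits) for _ in range(5)])   # repeated runs can differ
# As temperature approaches 0 the choice becomes nearly deterministic;
# higher temperatures make rarer (possibly wrong) continuations more likely.
```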

Content review involves unmanageable complexity. Because of inherent algorithmic black boxes and deficits in interpretability, it is difficult to understand the reasoning behind a model's predictions, and OpenAI notes on its website that the sheer volume of content these models generate makes manual review and vetting very difficult. According to OpenAI's paper, despite sharing these technical limitations, GPT-4 appears "more convincing and believable than earlier GPT models". That creates an even bigger problem: users who come to rely on it excessively are likely to drop their guard or overlook its errors.

2.2 Application Risks of Generative Large Language Models


Because generative AI large models require massive training data, and because of their generative, source-level, integrative and general-purpose characteristics, they will create a variety of enormous risks even as they empower thousands of industries.


2.2.1 Risk of personal information leakage


Users' dialogue with generative AI large language models is also a process in which personal information is widely collected. When asking questions, users may expose personal information they do not wish to disclose; yet according to OpenAI's instructions, users can only delete their personal accounts, not individual items of sensitive personal information. On 20 March 2023, a vulnerability in an open-source library used by ChatGPT allowed some users to see other users' conversations, names, email addresses and even payment information, and OpenAI had to post a reminder on its official website: "Please don't share any sensitive information in your conversations." Indeed, when generative AI is asked to answer questions or perform tasks, information a user inadvertently provides may enter the model's training, learning and improvement process and thus pass into the public domain. This may violate not only the user's own privacy but also that of others: when a lawyer uses the model to review a draft divorce settlement, for example, it may expose personal information about the parties to the case. Moreover, the large model's powerful reasoning abilities, such as writing programs to a user's specification, improve the product experience on the one hand but may pose risks of personal information leakage on the other.


2.2.2 Risk of Trade Secret Leakage


It has been reported that Samsung's semiconductor division suffered three trade secret leaks through the use of ChatGPT: one employee asked it to check the source code of a sensitive database for errors, another used it for code optimisation, and a third fed it a recorded meeting and asked it to generate the minutes. Whether for market entities, academic institutions or government agencies, using a large model inevitably means sharing certain information with it, and thus carries a huge risk of leaking business secrets or even confidential state information.


2.2.3 Data Security Risks


The data used for training may be inaccurate or biased; data quality is not guaranteed and even the data's legality can be difficult to ensure, so the generated content may be "toxic". As more industries and fields connect to generative AI large language models, data leakage and compliance risks grow more prominent, and the leakage of data as a factor of production will inflict huge economic and reputational losses on enterprises and industries. Even fragmentary or piecemeal information may be combined by ChatGPT with other data for mining and analysis, allowing it to infer intelligence bearing on national security, public safety, and the legitimate rights and interests of individuals and organisations. Especially for models whose servers sit overseas, such as ChatGPT and Bard, sensitive data entered during use may raise issues of cross-border data flows, threatening data security and even national security.

2.2.4 Cybersecurity risks


Generative AI may supply a convenient tool for cybercrime, because it lowers the threshold of expertise and models struggle to recognise the purpose behind a user's request. Used to write attack code, it can generate code in multiple languages such as Python and JavaScript, create malware that hunts for sensitive user data, and even assist in breaking into a target's computer system or email account to obtain important information. One expert has described in detail how ChatGPT could be used to create polymorphic malware, bypassing the content policy filters OpenAI has established. Criminals whose native language is not English can simply ask the model to compose marketing emails, shopping notifications or software-update notices in English; with few of the spelling and grammatical errors that usually give scams away, such messages are hard to recognise as fraud or phishing. In addition, account information used in large models' training processes may be shared with service providers and related companies, risking data leakage along the way and leaving openings for cybersecurity attacks.


2.2.5 Algorithm Risk


Generative AI is essentially the use of algorithms to process massive data, and the algorithm is the key. Yet because the algorithm itself cannot verify the training data, it may frequently generate misleading content that appears accurate but is essentially wrong, producing "hallucinations". The limited accuracy of model-generated content, and the model's inability to judge the truth of what it writes, easily lead to the generation and spread of false information. Nor are algorithms immune to social biases and values: a flawed algorithm may be steered into generating content that violates laws and regulations, and value judgements embedded in data use and algorithm training may yield "toxic" content that entrenches social bias and discrimination, not only by race but also by gender, faith, political stance and social status.


2.3 New Challenges in Content Governance


Generative AI large language models have the potential to substitute for the entire human thought process of information gathering, knowledge acquisition, content evaluation and deliberative reasoning. The large model's strengths in natural language processing, computer vision and other fields mean that, when generating text and images and conducting human-computer dialogue, it may create enormous information content risks, because its information production costs are lower, the threshold of professional knowledge is lower, its application functions are more aggregated, and its fields of use are wider.

2.3.1 Generativity creates a flood of poor-quality information


Generative AI can write or draft texts in place of some human labour; production costs become negligible and the volume of text soars. This huge growth in content will not only strain available physical storage and produce an information explosion, but, more importantly, will enable harmful or low-quality content to expand rapidly and spread massively.


First, false information degrades the online ecosystem. Generative AI may fabricate false information, output low-quality content that mixes truth with falsehood, and elaborate fabricated "facts" in fluent sentences, talking nonsense in a solemn tone that confuses groups with limited sources of information. "Automation bias" predisposes users to believe the answers output by seemingly neutral models. If generative AI's powerful content creation capabilities are used to generate inaccurate information against individuals and enterprises, the result is rumour-mongering, slander, insult and defamation; in particular, using deep synthesis technology to generate text, pictures or videos of speeches by fake politicians or other key figures may cause social unrest and still more harmful consequences.


Second, misleading information interferes with personal decision-making and daily life. Generative AI increasingly presents itself as a "knowledge authority", and erroneous or misleading content produced in professional consulting services such as business planning, legal services and healthcare has a direct impact on users' daily lives. Used for planning transactions and schedules, its hallucinations, limited accuracy and limited grasp of context make it prone to "making up" situations and planning wrong itineraries and timetables. Applied to professional consultations in healthcare, legal services and the like, a wrong answer may mislead the user and interfere with his or her medical treatment or litigation.


2.3.2 Initial source pollution


Traditional knowledge sources such as textbooks and the news media are increasingly being replaced by online platforms. The large model, which combines the functions of knowledge platform, search platform and generation platform, has the potential both to become a monopolistic knowledge source and to pollute knowledge at its source. Information content is created without human oversight, and producing malicious information at scale becomes easier and faster. Amid the proliferation of online "echo chambers" and "filter bubbles", mass-produced, unverified, one-sided content will create a false impression of majority opinion and exacerbate the polarisation of views.

First, misleading views of history. History is objective, but perceptions of history can be subjective. In the international arena in particular, distortions of history abound as a result of ideological conflict and value bias. In recent years, Western societies have seen persistent disputes over the understanding of World War II, and on the question of the War of Resistance Against Japan, China, South Korea and other Asian countries have repeatedly criticised Japan for whitewashing its war of aggression and distorting the historical record. As human creations, large models can hardly avoid the biases humans hold. Indeed, it is not uncommon for answers to political questions to amplify political bias and manipulate users' perceptions with false or misleading content, and the security risk grows once this is combined with bot accounts in cyberspace. Many tests have found that large Western models often reflect Western positions and values on China-related issues, even distorting history and misrepresenting facts.


Second, ideological and value bias. Large language models may carry various social biases and worldviews that represent neither the user's intentions nor widely shared values. The real world is no utopia of universal harmony: countries, political forces and interest groups hold quite different ideologies and values, and the prevailing power structures are reflected in information of every kind. The datasets required for large-model training often encode the ideologies and values of real society, and training may end up reinforcing them. Research has shown that much of the data in Western large-model training sets is generated primarily from the perspectives of white, male, Western, English-speaking people, so the data may be heavily skewed towards those structures. When the power structures of real societies are encoded in the large model, its outputs reproduce them, creating a Matthew effect of power that can amount to a system for reproducing oppression and degrade the information ecosystem. Especially in areas touching religion, human rights and other ideological and value-laden questions, in areas of intense conflict between national interests, and even on extreme topics such as the supposed superiority or inferiority of races and civilisations, monopolising the large model is equivalent to monopolising the textbooks, encyclopaedias and libraries. The large model could become a sharp weapon of cognitive-domain warfare, shaping public perception and manipulating international public opinion.


Third, the challenge of linguistic hegemony. The scale effects of the digital age confront minor languages with very great challenges. Language is the house of being, the carrier of culture and the presentation of civilisation. Although generative artificial intelligence can provide multilingual and cross-language services, large-model training requires an enormous corpus, and even domestic models such as Wenxin Yiyan are trained on code bases rooted in the English-language environment, which may carry not only value bias but also fierce competition between languages and the civilisations they represent. If a nation cannot master integrated, monopolistic platforms such as large models, it may in the end be unable even to preserve its own language, which may gradually dissolve away.


2.3.3 Ethical risks of generalisability


In an atomised, individualistic society, generative AI is increasingly becoming people's chat partner and intimate "friend", which brings a series of ethical challenges.


First, human beings may become more confused about, and misperceive, what a "human being" is. Under the hyper-competition and involution of the real world, and under increasingly atomistic individualist values, individuals in modern society grow ever lonelier and more estranged from one another. Generative AI large models can power chatbots and companion robots and even become "companions" for many lonely individuals, but they can also deepen interpersonal alienation and the isolation of individual lives. Technology helps humans, yet it may leave them unhappier.


Second, it may limit individuals' decision-making capacity and weaken the subject status of human beings. Generative artificial intelligence exhibits trends of disembodiment, de-realisation, de-openness and de-privatisation, concealing ever more thoroughly the risk that algorithms strip humans of their subjectivity; in essence it is an alienated manifestation of human-machine domestication. Human-computer communication will crowd out the space for interpersonal communication, weakening the social and psychological relevance of the embodied subject: social relations will no longer require the "presence" of the body, and "public life" will disappear. In other words, humans create algorithms, but algorithms can regulate and reformat humans, subtly altering their behaviour and values and eroding their subjectivity. People may delegate final decisions to some automated text generator, much as they ask Google existential questions today.


Third, content innovation and the advancement of knowledge may be stymied. Applied to writing, large language models can give rise to "article laundering", plagiarism and academic misconduct. Some foreign universities have begun banning ChatGPT on campus to prevent cheating in examinations and essay writing, and some leading international journals explicitly refuse to accept AI as a co-author. Large models may be great tutors, but they can equally be used as cheating devices. For minors in particular, over-reliance on generative AI will limit the growth of individual minds, jeopardising the formation of sound character, schooling and academic training. As large models make answers effortless to obtain, such frictionless information may undermine students' critical thinking and problem-solving skills, amplify laziness, and dampen learners' interest in investigating for themselves and reaching their own conclusions or solutions.


Fourth, it facilitates false propaganda and the manipulation of public opinion. In the era of self-media, opinion manipulation has become a more serious problem. In the disputed Iranian presidential election of 2009, the American social media platform Twitter became an important support tool for the opposition, which used it to cut mobilisation costs sharply and thus increase mobilisation capacity. The U.S. government's report that year on funding Iranian dissidents explicitly endorsed support for "new media", and officials even asked Twitter directly to postpone system maintenance so that the opposition would not lose its channels of contact. Inaccurate information originating on Twitter was then amplified by traditional media such as CNN and the BBC. But manipulators of opinion are often too clever by half. After the Cambridge Analytica affair, some American scholars predicted that large generative AI models, with ChatGPT as their representative, will become powerful tools for targeting candidates and swaying public opinion in the next round of elections.


3. The current state of information content governance for generative AI large language models


Artificial intelligence brings great possibilities and raises great concerns. Humans must prepare in advance for possible risks of loss of control, and universal legislation on the safety and ethics of general AI research and development is imminent. Because the targets of regulation are highly determinate, legislation in the field of specialised AI is maturing steadily: norms for autonomous driving, smart healthcare, algorithmic recommendation, AI investment advisers, facial recognition and other specific fields can all be found, at various levels, in the laws of many countries and regions. How to maximise the effectiveness of generative AI technology while reducing the negative impact of emerging technologies on social development has become an important global question.


3.1 Big Model Regulation Becomes an Important Issue in Europe and the United States


Many in the science, technology and industry communities have already urged caution about generative AI. They believe AI systems may pose profound risks to human society, that advanced AI may represent a profound change in the history of life on Earth, and that it should be planned and managed with commensurate care and resources. With AI labs locked in an out-of-control race, and no one able to understand, predict or control the large models, they argue for pressing the pause button on development, dramatically accelerating AI governance, and making rules for AI research and development. At one point the Italian data protection authority issued an injunction against ChatGPT and investigated it for alleged violations of European privacy rules. But because the generative AI large model is still in its infancy, countries around the world have not yet formed a systematic regulatory policy and system.

While adjusting its legislative agenda, the EU has decided to set up specialised working groups to facilitate cooperation and the exchange of information on possible enforcement actions by data protection authorities. Privacy regulators in several EU countries have said they will monitor ChatGPT for risks of personal data breaches under the EU General Data Protection Regulation (GDPR). The European Consumer Organisation (BEUC) has called on European regulators at EU and national level to investigate ChatGPT. The EU is revising its AI Act to address general-purpose AI such as generative large language models, considering requiring OpenAI to undergo external audits of system performance, predictability and the interpretability of safety settings. Under the regulatory framework envisioned in the EU AI Act, generative large language models would be classified as high-risk and subjected to strict regulation because of their potential to create harmful and misleading content.


The U.S. government has also begun to act. On 30 March 2023, the U.S. Federal Trade Commission (FTC) received a complaint from the Center for AI and Digital Policy (CAIDP), a nonprofit research organisation, arguing that GPT-4 satisfies none of the FTC's requirements that uses of AI be "transparent, explainable, fair, and empirically sound while fostering accountability", that it is "biased, deceptive, and a risk to privacy and public safety", and calling for an investigation of OpenAI and its product GPT-4 to determine whether they comply with guidance issued by the federal agency. On 4 May 2023, the Biden administration, announcing its intention to further promote responsible American innovation in artificial intelligence, said it would conduct public assessments of existing generative AI systems: in line with the principles of responsible disclosure, a group of leading AI developers including Google and Microsoft will submit specific AI systems to public assessment on a dedicated platform, providing researchers and the public with key information about the models' impact and evaluating compliance with the principles and practices of the Blueprint for an AI Bill of Rights and the AI Risk Management Framework, so as to prompt developers to address problems in good time. Earlier, in January 2021, the U.S. Congress had passed the National Artificial Intelligence Initiative Act (NAIIA), which aims to advance U.S. competitiveness in AI.

Generative AI, at the forefront of technological competition, has in fact become the preserve of a few countries; most have yet to make real progress in technology development, industrial deployment or regulatory governance. Moreover, foreign AI regulation still focuses mainly on traditional AI rather than on generative large language models. Nonetheless, given objective public concern about generative models, voices in the EU are demanding that generative AI large models bear high-risk obligations, which could significantly disadvantage local governments, industries and enterprises in the competitive field.

3.2 Status of Chinese Regulation


China has initially formed a multi-layered, comprehensive normative system for online information content governance composed of laws, administrative regulations, judicial interpretations, departmental rules and a series of normative documents. Information content governance for generative large language models thus already has a basic legal framework, a set of institutional constraints that allow development on the premise of not harming national security, the public interest, or individual rights and interests.


In terms of information content regulation, the information content security framework, consisting of the Criminal Law, the Civil Code, the National Security Law, the Anti-Terrorism Law and the Law on Public Security Administration Punishments, together with laws and regulations such as the Cybersecurity Law, the Personal Information Protection Law and the Measures for the Administration of Internet Information Services, explicitly prohibits harmful information such as information endangering national security or social stability and false information. The Provisions on the Ecological Governance of Network Information Content further bring vulgar and other negative information, long in a grey zone, into the regulatory framework, highlighting the diversification of governance subjects and objects. Norms such as the Provisions on the Administration of Network Audio and Video Information Services and the Provisions on the Administration of Internet Follow-up Comment Services have built an information content regulatory mechanism covering all platforms, providing a basis for content regulation of generative large language models.


In terms of responding to the risks of AI algorithms, the Administrative Provisions on Algorithmic Recommendation regulate algorithmic recommendation services, opening the rule-of-law process of algorithmic governance. The Administrative Provisions on Deep Synthesis of Internet Information Services address the use of generative and synthetic algorithms such as deep learning to produce text, images and other network information, and regulate technologies for generating or editing textual content such as chapter generation, text style conversion and question-and-answer dialogue, providing foundational rules for the application of generative large language models.


The Measures for the Administration of Generative Artificial Intelligence Services (Exposure Draft), whose consultation closed on 10 May 2023, set out a series of regulatory ideas covering the whole process of generative AI services: data use, personal information collection, content generation and content labelling. The balance between safety and development, however, is not easy to strike. Regulating first certainly reflects the regulator's keenness, but the impact on the industry's development must also be weighed carefully. Generative artificial intelligence, representative of a new generation of information technology, is currently a strategic high ground of international competition. China is at an early stage of this technology, its industrial foundation is not yet strong, and its accumulated experience of the technology's applications is insufficient, so imposing unduly harsh responsibilities on developers at the early R&D stage of home-grown generative large language models could constrain the industry's development. For example, whether service providers should bear product liability or other liability for harms that generative AI may cause should be analysed with care. The principle of inclusiveness and prudence should be upheld, leaving ample space for technological and industrial innovation on the premise of safeguarding national and social security.


4. Exploring Paths of Information Content Governance for Generative Artificial Intelligence


Cybersecurity is relative, not absolute, and "zero risk" is not a scientific goal. In the process of model development, it is objectively difficult for developers to foresee all potential risks, and they need to explore and experiment in a relatively relaxed environment, within reasonable limits. Risks brought about by technological advances can only be constrained, not avoided. For example, problems such as poor accuracy caused by large-model "hallucinations" and difficulties of accountability caused by algorithmic black boxes can only be controlled as far as possible; they cannot be completely eliminated.


4.1 Incentive Compatibility: Optimising the Rule of Law Environment for Big Model Development


A new round of technological and industrial revolution is unfolding, and every industrial revolution has had a significant impact on the rise and fall of countries, nations and civilisations. Amid the increasingly fierce rivalry between China and the United States, and the latter's extreme suppression, blockade and containment of China, whether China has generative AI big models of its own, and whether those models are sufficiently advanced and powerful, is the more fundamental question. Strict regulation must be premised on advanced technology and a strong industry.


Influenced by the traditional planned-economy system, and also owing to the severe international environment at particular stages, some localities and departments remain accustomed to intervening deeply in the market when regulating specific industries, even though the country has consistently adhered to the socialist market economy and advocated combining a proactive government with an effective market. In the Internet field in particular, because network security matters so greatly to national security, the Internet has become the forefront and main battleground of ideological struggle, and regulation of the Internet industry in China has consequently been relatively strict overall. In the field of information content governance, the Standing Committee of the National People's Congress has not yet passed a law; there are only administrative regulations, such as the Measures for the Administration of Internet Information Services, in force for more than twenty years, and the regulation of information content is carried out mainly through departmental rules. The result is an insufficient level of rule of law, excessive rigidity in industry regulation, and a pronounced pattern of one-way, top-down command regulation.

However, a great deal of empirical evidence, both domestic and foreign, shows that law and policy in modern societies do more than regulate; they are important dimensions of international institutional competition. As a comprehensive framework for solving social problems, law and regulation are not better the stricter, or the looser, they are; they must strike a judicious balance. If commanding, repressive, top-down one-way regulation is too rigid, it leads to laws that are difficult to implement or to selective enforcement. A regulator with too much power also faces a greater risk of regulatory capture, which ultimately inhibits technological innovation and industrial development and squanders opportunities for national development.


Against this background, developed countries have in recent years placed greater emphasis on incentive-based regulation. Experience shows that when regulatory measures and rules are compatible with the incentives of the regulated parties, regulatory objectives are not only easier to achieve; the cost of regulation also falls sharply and the motivation to comply with the law grows. Adhering to the principle of the rule of law and implementing incentive-compatible regulatory concepts have therefore become an important part of optimising the business environment under the rule of law. Because it offers stable expectations and long-term benefits, the rule of law has been hailed as the best business environment.


In the face of the rapid development of generative AI models, legislative and regulatory authorities must show respect for, and greater modesty towards, the market, innovation and industry autonomy, leaving broader space for the development of new technologies and applications. Since computing power is the foundation of big-model development and computing infrastructure is extremely expensive, China's legislation and policy choices should provide better policy space for financing new technologies and industries. Given the huge volumes of data needed for big-model training, regulation should also, on the premise of protecting personal information and data security, remove unreasonable obstacles to data training and related activities as far as possible, so as to promote the reasonable circulation and utilisation of data elements. Legislation must conform to objective laws of development, and regulation must fit reality. We should face squarely the risks and challenges posed by generative AI, balance innovation and the public interest, ensure the beneficial application of generative AI while avoiding social risks, and ultimately establish an empowering regulatory concept and model that integrates development and safety and accords with the objective laws and stage of development.


4.2 Pluralistic Governance: Building a Governance Mechanism with Corporate Social Responsibility and Active Participation of Individuals


Technological innovation and industrial leaps are the source of national prosperity, while adherence to the rule of law and scientific regulation is its institutional guarantee. Since the second half of the twentieth century, the distinction between "management" legislation and "governance" legislation has become increasingly clear. The mode of social governance suited to an era of high complexity and high uncertainty is one of cooperative action: only when multiple governance actors undertake governance activities together, with a willingness to cooperate, can the wide variety of emerging social problems be solved and excellent governance performance achieved.


The Internet industry is highly specialised because of its technological complexity. The history of the Internet's development shows that while the supporting role of governments and states cannot be ignored, the role of the scientific and technical communities is equally important. Adhering to the open-source spirit, the exchanges among scientists and technical professionals, and the consensus they have reached, have profoundly shaped Internet protocols, standards and rules and given strong impetus to the development of the global Internet. In particular, for the Internet as a new technology and industry, the complex world of code and technological development behind it often runs ahead of the everyday world and cannot immediately be fully understood by society at large, including regulators; nor is its development potential readily apparent. Without sufficient patience, tolerance and a philosophy of moderation and rationality, it is easy to stifle vital innovations out of fear of risk. In the field of new Internet technologies and applications, the pursuit of absolute security often produces greater insecurity. In this context, the world's major Internet countries, including China, generally pursue pluralistic governance and social co-governance, which both mobilises the full participation of enterprises and society and reserves broad space for the development of new technologies and applications.


As a new direction in the development of Internet information technology, generative AI big models have shown explosive, revolutionary potential. As productivity tools empowering thousands of industries, they are likely to bring great benefits to technological innovation, industrial leapfrogging, social governance and personal well-being, and may even become an important factor in a country's comprehensive competitive strength. In this scenario, the first step should be to support and enable the development and deployment of big models, while reinforcing corporate social responsibility, regulating data processing and personal information protection, ensuring that the development and application of AI models comply with moral and ethical standards, and promoting algorithms for good. Enterprises need to strengthen risk identification and data traceability, improve technical governance capabilities, clarify data sources and training processes, identify potential biases and other risks in data sets, and monitor content output and identify risks through manual review or dedicated monitoring systems. They should also establish feedback and complaint mechanisms so that emerging risks can be received, monitored and assessed in real time and remedied promptly, as the sketch below illustrates.
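To make the monitoring-and-feedback idea concrete, the following minimal Python sketch shows one possible shape of such a mechanism. Every name in it (BLOCKLIST, ComplaintLog, check_output) is a hypothetical illustration rather than any regulator's or vendor's API, and a production system would rely on trained classifiers and human review queues rather than simple keyword matching.

```python
# Illustrative sketch only: screen model output before release, keep an
# auditable complaint log, and withhold flagged content pending review.
import datetime
from dataclasses import dataclass, field

# Placeholder for a real risk lexicon or trained content classifier.
BLOCKLIST = {"example-harmful-term"}

@dataclass
class ComplaintLog:
    entries: list = field(default_factory=list)

    def record(self, text: str, reason: str) -> None:
        # Timestamped entries give an auditable trace for later risk
        # assessment and remediation.
        self.entries.append({
            "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "text": text,
            "reason": reason,
        })

def check_output(generated: str, log: ComplaintLog) -> str:
    """Screen generated text; flag risky content and withhold it."""
    if any(term in generated for term in BLOCKLIST):
        log.record(generated, "blocklist match")
        return "[content withheld pending review]"
    return generated

log = ComplaintLog()
print(check_output("an ordinary, harmless answer", log))
print(check_output("contains example-harmful-term somewhere", log))
print(len(log.entries), "complaint(s) recorded")
```

The design point is simply that screening, logging and remediation form one loop: every withheld output leaves a record that the feedback and complaint mechanism can act on.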


The application and impact of generative AI big models are global, and harmonising technical standards requires the joint efforts of R&D organisations across countries. As the world's largest Internet market, China also needs a sense of participating in international Internet governance and providing Internet public goods to the international community, supporting its big models and platforms in joining and convening the global technical community and making Chinese contributions to technology, ethics and rules.


Of course, citizens' digital literacy must also be improved to avoid a digital divide arising from the uneven application of generative large language models. First, enhance users' overall understanding of new technology applications, encouraging the public to view and evaluate new technologies in a scientific, rigorous manner rather than blindly following or opposing them. Second, popularise knowledge of neural networks, deep learning and related technologies, helping people understand the operating principles and limitations of generative AI and avoid technological dependence. Finally, strengthen the ability to distinguish true from false information, guiding the public to maintain a rational, discerning attitude towards the output of generative AI.

4.3 Governing by Law: Building a Legal Framework for Generative Large Language Models


In terms of Internet information content governance, China coordinates network ideological security under the overall national security concept. Within the framework of laws and regulations such as the National Security Law, the Cybersecurity Law, the Anti-Terrorism Law and the Measures for the Administration of Internet Information Services, all network communication platforms engaged in news and information services, or possessing media attributes and public-opinion mobilisation functions, have been brought within the scope of management, and content that endangers national security, undermines national unity or disrupts social stability is strictly prohibited. China has insisted on modernisation driven by informatisation. First, it has coordinated the development of network information content with the building of a strong network country, effectively promoting the rapid development of network information technology and a great enrichment of information content. Second, it has coordinated the construction of network information content with the building of network civilisation, shaping a positive, uplifting network culture and encouraging the public to consciously resist the erosion of illegal and undesirable information. Third, it has coordinated the ecological governance of network information content with the building of the network rule of law, effectively curbing the spread of illegal and undesirable information in cyberspace and optimising the network ecology.


As a brand-new platform for producing and disseminating information content, the generative AI big model has not yet revealed its full picture, but its generativity, integration, versatility and intelligent interactivity are pushing it towards a dominant position in information production and dissemination. In legislation and regulation, therefore, its risks should be identified as accurately as possible, and chain regulation from data to algorithms to content should be improved within the existing information content governance framework. First, the collection, storage and use of user data should be regulated to prevent user data from being put to harmful purposes or used to generate false, erroneous or misleading content. Second, the algorithm filing system should be improved, and companies should be guided to establish third-party auditing or self-regulatory mechanisms for AI-generated text, images, videos and other content. Third, while harmful information is identified and regulated, individuals' freedom to acquire knowledge and create content should also be respected.


First, a scientific and clear mechanism for allocating legal responsibility should be established. Legislation should require generative AI service providers to ensure the reliability and accuracy of data; to fulfil content-auditing obligations so as to avoid generating harmful information; to fulfil special labelling obligations by marking deeply synthesised content in a conspicuous, eye-catching manner; and to establish mechanisms for preventing, promptly identifying and stopping the generation and dissemination of harmful and undesirable information. As for users, where the service provider has discharged its security-management responsibility and duty of care, users who employ the model as a tool for cybercrime shall bear criminal liability themselves. Other information platforms should promptly screen out false information and other harmful content generated by models and prohibit or restrict its dissemination on their platforms. Depending on the nature and consequences of different behaviours, different types of liability should be determined.
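As one concrete reading of the conspicuous-labelling obligation, the following minimal Python sketch prepends a visible human-readable notice to generated text and appends machine-readable provenance metadata. The notice wording, the metadata fields and the label_synthetic function are all hypothetical illustrations, not a format mandated by any law or regulation cited above.

```python
# Illustrative sketch only: attach a conspicuous label and provenance
# metadata to AI-generated text.
import json

def label_synthetic(text: str, model_name: str) -> str:
    """Prepend a visible notice and append provenance metadata (hypothetical format)."""
    notice = "[AI-generated content / 人工智能生成内容]"  # assumed wording
    provenance = json.dumps({"generator": model_name, "synthetic": True},
                            ensure_ascii=False)
    return f"{notice}\n{text}\n<!-- provenance: {provenance} -->"

print(label_synthetic("Sample generated answer.", "example-model"))
```

Pairing a human-visible notice with machine-readable metadata is one way a provider might satisfy both the "conspicuous and eye-catching" requirement for readers and the traceability needs of downstream platforms that screen model-generated content.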


Second, domestic rule of law and foreign-related rule of law should be advanced in a coordinated manner. At present, mainstream generative AI big models are concentrated in China and the United States, with the United States holding a clear lead. Where actors abroad use generative large language models to harm China's interests, interfere in China's internal affairs through political manipulation and ideological bias, or transmit other information suspected of being illegal or criminal, Article 50 of the Cybersecurity Law provides that "technological and other necessary measures shall be taken to block the dissemination of such information". Indeed, in response to foreign governments or related organisations using generative large language models to transmit information that violates China's laws and regulations, China should not only take technical blocking measures but also explore establishing a countermeasure mechanism, so as to better safeguard national sovereignty, security and development interests.