Jan Philip Wahle
February 27, 2023
They’re already here. AI systems like ChatGPT generate human-like content with impressive quality. But just because we can use them doesn’t mean we should — a critical perspective on AI systems and how to make their use responsible.
Artificial intelligence (AI) has taken the world by storm, making remarkable strides in solving complex problems ranging from language translation to self-driving cars. While its solutions continue to amaze us, experts are questioning the use of AI in critical areas such as medical diagnosis and criminal justice. High-stakes scenarios have stirred up debates on the ethics of relying on algorithms to make critical, life-altering decisions.
Despite the novelty of AI for content generation, many are already using it to support everyday tasks such as software development or technical writing. Yes, even this blog post used ChatGPT to rephrase paragraphs and make the text more accessible to readers (more on this later). AI assistants seem to be reaching mass adoption faster than any technology we have known so far.
A quick comparison: Netflix took 3.5 years to reach one million users. To be fair, Netflix is a paid subscription service, while Facebook, Spotify, and Instagram are free. Still, Facebook took 10 months to gain one million users; Instagram achieved the same in 2.5 months. Impressive! But not compared to ChatGPT. Only five days after its release, OpenAI, the company behind ChatGPT, reported one million users. Five days.
With such lightning-fast technological adoption, the rules and norms to govern AI systems have yet to be determined. Regulation is typically much slower than innovation. Does that mean technological change should be tightly controlled by regulatory agencies to prevent risks, or should it be allowed to advance without restriction, embracing both the risks and benefits that come with it?
As with any other innovation, there are two main issues:
- The impacts and effects of using a certain technology are often unknown in the short term.
- Legislation is protracted and requires research based on measurements or historical data.
One of my favorite examples of a technological revolution and the legislation that follows is the seatbelt. While anyone nowadays would agree that seatbelts reduce traffic fatalities and save millions of lives every year, back in 1946, when the first seatbelt models were designed, the idea was neither obvious nor widely accepted. In fact, many car companies chose not to include seatbelts in their offering at all.
Ford offered seatbelts only in 1955, roughly ten years after their invention. And they were not particularly popular: only 2% of Ford buyers chose seatbelts in 1956. Another ten years later, in 1966, Congress passed the National Traffic and Motor Vehicle Safety Act, requiring all automobiles to comply with certain safety standards, including seatbelts.
The adoption of safety features in cars was a slow and steady process that took two decades to become standard practice. While this may seem like a no-brainer, implementing safety features requires extensive evaluation to ensure they do not pose potential harm to individuals. The world of AI is far more expansive than the automotive industry, with endless potential applications across fields. Despite this complexity, there are already specific cases where AI's application can be explored, while in other scenarios it may not be suitable at all.
When Should We Use AI?
We typically base the decision of whether to use a technology on the risk associated with its outcomes. Risk is the severity of an event times the probability of its occurrence. For example, if human life is at stake (high severity) and a wrong decision occurs in half of all cases (high probability), the risk is high. But if the output is a local news article (low severity) and the chance of it containing misinformation is small (low probability), the risk is low too.
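This risk calculus can be sketched as a tiny function. The numeric values below are hypothetical illustrations of the two examples from the text, not calibrated estimates:

```python
def risk_score(severity: float, probability: float) -> float:
    """Risk as the severity of an outcome (0..1) times its probability (0..1)."""
    return severity * probability

# Hypothetical values: a life-threatening diagnosis with frequent errors
# versus a local news article with a small chance of misinformation.
medical = risk_score(severity=1.0, probability=0.5)
news = risk_score(severity=0.2, probability=0.05)

print(f"medical: {medical:.3f}, news: {news:.3f}")
```

Real risk assessments are far richer than a single product, but even this toy version makes the point: high severity with high probability dominates everything else.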
Some applications are inherently “high risk”. These include all decisions about individuals that affect their lives and well-being, ranging from medical diagnoses and treatments to legal judgments. Some of these decisions should probably never be made by a machine. But why is that?
Let’s assume there are already certain medical applications in which a machine has a higher success rate than the average professional. Wouldn’t we want to use that system even if there will still be wrong diagnoses and potentially fatal outcomes? Why would we favor human judgment if it leads to objectively worse decisions and therefore more harm?
Favoring humans has mainly to do with trust and accountability. We believe humans decide to the best of their knowledge and with the best intentions. The machine has, as far as we know, no conscience. It decides purely rationally. It doesn’t really care whether individuals suffer or die; it simply takes the most probable decision it was designed to take based on the situation. In addition, we can hold humans accountable for their wrongdoing if they act negligently.
Another important aspect to consider is the potential for AI to perpetuate biases and discrimination. AI systems are only as good as the data they are trained on, and if that data is biased or incomplete, the AI system may learn and perpetuate those biases. Undeniably, biases also exist in humans. To this day, medical indicators for diagnosis and treatment are often not differentiated by gender, even though some diseases present entirely different symptoms in men and women. Such biases transfer to other areas, such as criminal justice and hiring decisions, where AI systems may unwittingly discriminate against certain groups.
One might argue that the system can only be biased by the information we put in: if we don’t input gender and skin color, it cannot base decisions on them. Unfortunately, today’s AI systems have already learned complex second-order dependencies. For example, given a home address, systems like ChatGPT can infer the most probable skin color from the demographic data of individuals in that neighborhood present in their training data.
Given these concerns, some experts have suggested that AI should only be used in certain circumstances and with safeguards in place. For example, the European Union’s General Data Protection Regulation (GDPR) requires companies and organizations to ensure that they are not making decisions based on biased or incomplete data.
OpenAI argues that AI adoption is inevitable, that it shouldn’t be restricted in the short term given its broad spectrum of applications, and that it should be integrated into all aspects of society.
The optimal decisions will depend on the path the technology takes, and like any new field, most expert predictions have been wrong so far.
Proponents of this view argue that AI’s benefits outweigh the risks. They also point out that AI is already being used in many areas, such as healthcare and finance, and that it would be difficult to reverse these integrations at this point. Opponents see this as anthropomorphizing AI and playing down the seriousness of its risks.
Successfully transitioning to a world with superintelligence is perhaps the most important — and hopeful, and scary — project in human history.
However, even those who fully support the use of AI acknowledge the need for oversight and regulation. They argue that AI systems must be transparent and aligned with human values, and that there need to be safeguards in place to prevent discrimination and bias. While legislation and official policies will likely take years, we need solutions now to prevent serious problems.
How Can We Use AI Responsibly?
One way to approach this question is to align AI with the human values we already demand for responsible practice. What do we require from doctors and lawyers when they make “high-stakes” decisions? One can define three crucial pillars:
- Transparency is the foundation of responsible AI practice, involving understanding model capabilities, data used in training, and limitations. Users should acknowledge the use of AI models in a standardized way, similar to other forms of acknowledgment.
- Integrity focuses on ensuring that generated content aligns with the users’ intentions and is free from unwanted bias. Authors need to fact-check and approve the generated content against their own findings and related studies to affirm its integrity.
- Accountability involves taking ownership of the outcomes produced by AI models, being open to feedback, and being transparent about the model’s limitations and potential risks associated with its use. It is important to hold users accountable for the decisions made using AI models and be willing to make changes as necessary.
Multiple ways to provide transparency have emerged recently. They all rest on the assumption that, to make policy decisions, we need to know where AI models have been used and what their impact was.
- Watermarking. Since the early days of paper production, watermarks have been used to prove the origin of a good. Nowadays, many of these goods are digital, and embedding watermarks in digital products has become standard. For content generation models, there have been efforts to incorporate invisible watermarks that can later verify, for example, that a text was generated automatically. This works well if the model embeds a watermark by design, but open-source replications of models without watermarks can bypass such mechanisms.
- Detecting. If watermarks fail, we can still build classifiers: machine learning models trained to identify machine-generated text up to a certain accuracy. This method can even identify unknown or new AI models to some degree. However, when models or training data change significantly, detectors usually become unreliable.
- Reporting. Finally, we can give users the tools to provide transparency themselves. We don’t want people to use AI hidden in the dark; given the limitations of when to use AI, its use should be openly communicated. Writing this blog post, I also used ChatGPT to rephrase parts of the introduction and to make some arguments clearer. Ideally, we give individuals tools to report AI usage in a standardized way that is traceable over time and machine-readable for automated analysis. Of course, this assumes that people would actually use such a reporting system. Different authorities can encourage this, for example scientific conferences or medical documentation requiring such reports. The Association for Computational Linguistics (ACL) already requires a checklist from authors for their scientific work to avoid common pitfalls.
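To make the watermarking idea concrete, here is a minimal toy sketch of the detection side of a “green-list” scheme: each pair of adjacent tokens is pseudo-randomly assigned to a green or red half of the vocabulary, and a watermarked generator would prefer green continuations, so its text scores well above the roughly 0.5 expected by chance. This is a simplified illustration; real schemes operate on model tokenizers and logits, not whitespace-split words:

```python
import hashlib

def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly assign a token pair to the 'green' half of the vocabulary."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text: str) -> float:
    """Fraction of green token pairs; hovers around 0.5 for unwatermarked text."""
    tokens = text.lower().split()
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    return sum(is_green(prev, tok) for prev, tok in pairs) / len(pairs)

score = green_fraction("the quick brown fox jumps over the lazy dog")
print(f"green fraction: {score:.2f}")  # a value near 0.5 suggests no watermark
```

A detector would flag text whose green fraction is statistically too high to occur by chance. The sketch also shows why open-source replications bypass the mechanism: a model that never saw the green-list rule produces text indistinguishable from the chance baseline.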
A framework for reporting the use of AI systems also lets us acquire the data needed for the legislation mentioned before and gives policymakers information upon which they can base decisions. Reporting also gives authors the possibility to be accountable and to verify integrity, whereas detection and watermarking provide accountability only to some extent and don’t address the integrity issue at all.
In the past, some efforts have standardized the reporting of AI, for example for models or datasets. Recently, we released a similar framework for AI usage: by answering a five-minute questionnaire, one can generate an individual card in a machine-readable format that can be included in many work products.
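A machine-readable usage report could look like the following sketch. The field names here are hypothetical illustrations, not the official AI Usage Card schema produced by the questionnaire:

```python
import json

# Hypothetical fields; the actual AI Usage Card framework defines its own schema.
usage_card = {
    "work": "Example blog post",
    "model": "ChatGPT",
    "purposes": ["rephrasing paragraphs", "clarifying arguments"],
    "content_checked_by_author": True,
}

report = json.dumps(usage_card, indent=2)
print(report)
```

Because such a report is structured rather than free text, future researchers and policymakers could aggregate thousands of them automatically to analyze AI use across time and fields.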
The advantage of reporting is that it is transparent and reliable, and it builds accountability and integrity into the documentation process. Detection and watermarking can easily be circumvented when models and datasets change. A standardized, machine-readable reporting format also allows future researchers and policymakers to analyze past AI use across time and fields.
Of course, there are also some obvious limitations. Individuals may not fully capture the broader social and cultural context in which they operate. Additionally, certain content aspects may not be fully assessed, which could limit the framework’s ability to comprehensively analyze the issue. Furthermore, people may not always provide truthful responses in their reports, which could introduce bias and limit the accuracy of the framework’s results.
Regardless, an incomplete solution is better than none. Vast technological changes have prompted many analyst and expert predictions, and most of them turned out wrong. Currently, there is no reliable way of knowing what role reporting AI usage will play in the future, but it can certainly help analyze AI diachronically, provide data-driven insights for policymakers, and help the community steer when and how to use AI.
In conclusion, the use of AI is rapidly increasing, and its implications for society are complex and multifaceted. The deployment of AI poses significant challenges that require a continuous feedback loop of learning and iteration. As AI systems evolve, society will need to grapple with difficult questions about the ethical and practical implications of their use, including issues of bias, job displacement, and more. The best decisions will depend on the trajectory of the technology, but given the unpredictability of its evolution, planning in isolation is fraught with difficulty. As we continue to explore the potential of AI, we must remain vigilant, adaptive, and committed to a thoughtful, iterative approach. With these swift changes, it is vital that we approach AI with caution and a clear understanding of its potential benefits and risks.
AI Usage Card for this Blog Post
For this blog post, I used ChatGPT, mainly for rephrasing and suggesting text. You can find the corresponding report in the form of the aforementioned AI Usage Card here.
Thanks to my dear friend and exceptional advisor Terry Ruas for providing feedback and suggestions for this post.