
Ethical AI standards: a process approach

This is part 1 of a 5-part series on Ethical AI.

Ethics on its own is a messy topic. It permeates nearly every facet of our existence: our day-to-day interactions with others, our treatment of public property, the policies enacted in the political landscape, and the social demands we place on companies. At every level we carry certain demands, needs, and expectations, and rightly so, for they push us forward in pursuit of the good life. Enter the bustling development of Artificial Intelligence, and the issue compounds further. As with any new technology, we must take due care to set guidelines. By the good graces of philosophy, we have the means to unpack the complexities of ethics, reduce them to their barest essentials, and build up systems we can be confident are sound and just. To that end, this article examines the natural problems that arise in AI: human autonomy, privacy, transparency, and proper data infrastructure.

Human Autonomy

Human autonomy maintaining primacy over AI is essential to ethical AI. Why? Because it is through human action that we arrive at moral considerations. A hurricane is neither moral nor immoral. We may call a hurricane 'bad' when it leads to unfortunate outcomes, such as loss of life or destruction of property, but the hurricane itself is not in any sense morally blameworthy, and it is ill-fitting to treat it as such. In the same way, AI is merely a tool; it is how humans use their tools that has moral implications. A knife by itself is just an object, but a potentially harmful one, and leaving it carelessly in a room where children could get hurt is an immoral course of action. This is the mindset we should adopt with AI: put the knife in the proper drawer, so to speak, to ensure no one gets cut. So how do we put this into practice?

The answer will of course depend on the AI's purpose, so the correct human-to-AI interaction must be designed with autonomy in mind. Consider the HR sector, nugget.ai's area of focus. nugget's screening engine may be utilized for talent acquisition: through scientifically backed models, it analyzes a given pool of candidates and generates recommendations and insights for the user. Naturally, a structure unfolds. The screening engine becomes a tool to inform decisions, and informing decisions is very different from forcing them. That is why users can access all relevant candidate information, manually shortlist candidates when necessary, and so on, all of which augments rather than replaces our decision-making. With AI on our side, we can draw on deep insights we would not otherwise have access to, yet we remain able to consider all options holistically and make the most informed decisions possible. Beyond nugget.ai's corner of the field, we can imagine how AI in other sectors would demand varying degrees of autonomy: automated, predictable activities we may let the AI handle exclusively, whereas contextual decisions with unpredictable consequences require stronger measures to restrict the AI's ability to act.
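To make the division of labour concrete, here is a minimal sketch of the "inform, don't force" structure described above. The names (`Candidate`, `recommend`, `human_review`) are hypothetical illustrations for this article, not nugget.ai's actual API: the engine only proposes a ranking, while the human reviewer keeps the final say.

```python
from dataclasses import dataclass

# Hypothetical illustration of "inform, don't force" -- not nugget.ai's
# actual API. The engine proposes; the human disposes.

@dataclass
class Candidate:
    name: str
    fit_score: float           # model-generated estimate in [0, 1]
    shortlisted: bool = False  # set only by a human decision

def recommend(candidates: list[Candidate], top_n: int = 3) -> list[Candidate]:
    """AI side: surface insights by ranking candidates, nothing more."""
    return sorted(candidates, key=lambda c: c.fit_score, reverse=True)[:top_n]

def human_review(candidates: list[Candidate], picks: list[str]) -> None:
    """Human side: the reviewer may shortlist anyone, recommended or not."""
    for c in candidates:
        c.shortlisted = c.name in picks

pool = [Candidate("Ada", 0.91), Candidate("Grace", 0.87), Candidate("Alan", 0.66)]
suggested = recommend(pool)                  # the AI informs...
human_review(pool, picks=["Grace", "Alan"])  # ...but the human decides,
                                             # even against the ranking.
```

The design point is that nothing in the flow lets `recommend` set `shortlisted` itself; the final state of the pool can only ever come from the human call.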

Privacy

The topic of privacy has seen quite the limelight in recent times. Irrespective of personal beliefs about whether user data should be a commodity, the fact of the matter is that there is a security issue at the root of the problem; Patricia Thaine's article Why is Privacy-Preserving Natural Language Processing Important? highlights the key concerns. A social responsibility is therefore placed on a company whenever it is given access to your data; to pretend otherwise would be nothing short of negligence. Further considerations include what kind of information is being collected, and for what purpose. There is a substantial argument that information should be used to improve our understanding of the world in order to make it better. Companies with psychometric-based AI systems have begun cropping up with the goal of detecting employee mental wellbeing. Given consent and sufficient privacy measures (e.g. not releasing or storing sensitive information on those being evaluated), it appears reasonable to allow such systems to evaluate users, while maintaining the highly necessary standard of respecting individuals' privacy.
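As one illustration of "not storing sensitive information", the sketch below pseudonymizes a record before it is persisted. The field names and the salted-hash scheme are assumptions made for this example, not a description of any particular company's pipeline.

```python
import hashlib
import os

# Minimal sketch of one privacy measure: store evaluation results under a
# pseudonym instead of a direct identifier. Field names and salting scheme
# are illustrative assumptions only.

SALT = os.urandom(16)  # in practice, a securely managed per-deployment secret

def pseudonymize(record: dict) -> dict:
    """Replace the direct identifier with a salted hash before storage."""
    token = hashlib.sha256(SALT + record["email"].encode()).hexdigest()[:16]
    return {"subject_id": token, "wellbeing_score": record["wellbeing_score"]}

raw = {"email": "employee@example.com", "wellbeing_score": 0.72}
stored = pseudonymize(raw)  # the email never reaches the data store
```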

Luckily, security measures are within reach: a host of encryption methods exist to help keep data private. At nugget, this is essential, for the sensitive data collected from both potential candidates and the current workforce must be handled with due care. That is why both AES-256 and TLS are in place to help achieve this end.
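For a sense of what AES-256 looks like in practice, here is a generic sketch using the widely available `cryptography` package in authenticated GCM mode. It illustrates encryption at rest in miniature; it says nothing about nugget's actual key management, which is not described here.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Generic AES-256 illustration (authenticated GCM mode) -- a sketch of
# encryption at rest, not a description of nugget's deployment.

key = AESGCM.generate_key(bit_length=256)  # 256-bit key -> "AES-256"
aesgcm = AESGCM(key)

nonce = os.urandom(12)  # must be unique per message under a given key
plaintext = b"candidate assessment: confidential"
ciphertext = aesgcm.encrypt(nonce, plaintext, None)

# Decryption fails loudly if the ciphertext was tampered with.
assert aesgcm.decrypt(nonce, ciphertext, None) == plaintext
```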

Transparency

Of course, the inverse of privacy is transparency: some things must be private, and some must be public. Transparency helps with accountability, deliberation, and benchmarking. Just as we want financial transparency in accounting, the same is true of how we use our technology. On one side, it is important that the AI delivers easy-to-digest numbers for users to interpret. Scores are an excellent way to rank options, which is why concepts like the nugget score are powerful tools for summing complex analyses into overall results to act upon. On the other side, the specific use of the program should be documented, in line with keeping a performance overview of how one utilizes the AI handed to them. Executive decisions to either accept or go against an AI's recommendation ought to be recorded and well understood. This keeps companies accountable for the use of their technologies, and it gives the creators of the software the ability to go back and see whether tweaks can be made to the AI. It also shows us how to benchmark the AI: we can see whether it makes consistent errors in one area or delivers correct, strong results. All of this is possible with transparent reporting structures.
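A transparent reporting structure can start as simply as a decision log. The sketch below is a hypothetical schema, not an existing product feature: it records what the AI recommended, what the human decided, and whether the two agreed, which is exactly the raw material benchmarking needs.

```python
import csv
from datetime import datetime, timezone

# Hypothetical decision log: one row per executive decision, capturing
# the AI's recommendation, the human's call, and whether they agreed.

def log_decision(path: str, candidate: str,
                 ai_recommends: bool, human_decides: bool) -> None:
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([
            datetime.now(timezone.utc).isoformat(),
            candidate,
            ai_recommends,
            human_decides,
            ai_recommends == human_decides,  # did we follow the AI?
        ])

log_decision("decisions.csv", "Ada", ai_recommends=True, human_decides=True)
log_decision("decisions.csv", "Alan", ai_recommends=False, human_decides=True)
```

Aggregating the last column over time shows where users routinely override the AI, which is precisely where the creators should look for consistent errors.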

To that end, the transparency of an AI's reasoning is another aspect of this consideration. Again, consider the nugget score. The score is not simply handed to you; the rationale comes with it, showing benchmarks and statistical breakdowns of how every candidate did on the challenges they completed. The overall idea is this: the more an AI can communicate to its user, the better. As AI grows more complex, it must have a means of reporting its findings to users in an easy-to-digest way.
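To show what "a score plus its rationale" might look like as data, here is an illustrative shape only; the real nugget score format is not published here, so every field name is an assumption. The point is that the summary number never travels without the breakdown behind it.

```python
# Illustrative shape only -- not the real nugget score format.

score_report = {
    "candidate": "Grace",
    "overall": 84,                      # the easy-to-digest summary
    "rationale": {                      # the reasoning behind it
        "problem_solving": {"score": 90, "benchmark_percentile": 88},
        "communication":   {"score": 78, "benchmark_percentile": 61},
    },
}

for skill, detail in score_report["rationale"].items():
    print(f"{skill}: {detail['score']} "
          f"(better than {detail['benchmark_percentile']}% of benchmark)")
```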

Proper Data Infrastructure

The last topic is thankfully a relatively simple idea: ensure data is collected, parsed, and processed correctly. The scientific process is relatively good at this; valid empirical findings, organized correctly, make for potent and useful results. That concept is as basic as science gets. AI simply demands proper labelling within whatever the given taxonomy is for the proper execution of its processes. The idea is obvious, but nevertheless extremely important. If an AI dedicated to diagnosing ailments had incorrect information fed into its database (e.g. incorrect symptoms for certain diseases, an inability to parse colours correctly), its findings could be catastrophic, recommending the wrong treatment and sowing confusion among those who must read and interpret the results. That is why strict adherence to the practices afforded by data science reduces the potential for error. What follows is not only proper labelling, but also the need for substantial and robust population sets. No algorithm is biased by virtue of its constructed makeup unless it was deliberately coded that way; yet we find biased AI systems anyhow. The reason is simple: the bias lies in the data set fed into the training phase. If sampling bias is inherent in your training set, the results will be biased in equal measure.
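Two of the checks implied above can be automated cheaply: confirming every label belongs to the agreed taxonomy, and flagging when one class dominates the training set (a crude proxy for sampling bias). The taxonomy, threshold, and labels below are illustrative assumptions.

```python
from collections import Counter

# Two crude data-infrastructure checks: (1) every label is in the agreed
# taxonomy, (2) no single class dominates the set. All values illustrative.

TAXONOMY = {"flu", "migraine", "allergy"}

def validate(labels: list[str], max_share: float = 0.6) -> list[str]:
    problems = [f"unknown label: {l!r}" for l in labels if l not in TAXONOMY]
    counts = Counter(labels)
    for label, n in counts.items():
        if n / len(labels) > max_share:
            problems.append(f"class imbalance: {label!r} is {n}/{len(labels)}")
    return problems

sample = ["flu", "flu", "flu", "flu", "migrane", "allergy"]
print(validate(sample))
# Flags the misspelled label 'migrane' and 'flu' dominating at 4/6.
```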

To learn more about proper data infrastructure, click here for part 2!

Nicholas Tessier 🧠

Product Manager