1.1. Introduction
1.2. Big Data and Bias
1.3. Machine Injustice
1.4. Black Boxes
1.5. When Models Fail
1.6. Ethical Algorithms
Why you should watch these lectures/read the notes!
Sure, it's not on the exam... but you want to be a good person as well as land a job, right?
Machine Learning Scientist - Recruiting Technology, Scotland, Edinburgh (Amazon job advert via indeed.com)
"...Our ideal candidate is an experienced ML scientist who has a track-record of statistical analysis and building models to solve real business problems, who has great leadership and communication skills, and who has a passion for fairness and explainability in ML systems."
Tenure-Track/Tenured Professor of Computer Science - Artificial Intelligence
"...we are particularly interested in machine learning; natural language processing; information retrieval; human-computer interaction; vision; fairness, accountability, transparency, and justice in AI."
Notes
The Navy revealed the embryo of an electronic computer today that it expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.
July 8, 1958, The New York Times
The embryo in question is a perceptron, a simple logical circuit designed to mimic a biological neuron.
It takes a set of numerical values as inputs, and then spits out either a 0 or a 1.
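As a toy illustration (not Rosenblatt's hardware, just the same idea in Python), a perceptron is a weighted sum followed by a threshold; the weights and threshold below are arbitrary:
import numpy as np
def perceptron(inputs, weights, threshold):
    # weighted sum of the inputs; output 1 if it clears the threshold, otherwise 0
    return int(np.dot(inputs, weights) >= threshold)
# arbitrary weights/threshold, purely for illustration
print(perceptron(np.array([0.5, 1.0, 0.2]), np.array([0.4, 0.3, 0.9]), threshold=0.5))  # -> 1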
Notes2
Connect enough of these perceptrons together in the right ways, and you can build:
Though the computer hardware is vastly more powerful, the basic approach remains similar to how it was a half century ago.
from IPython.display import HTML
# https://www.theguardian.com/technology/2016/mar/15/alphago-what-does-google-advanced-software-go-next
# https://www.bestcriminaldefencebarrister.co.uk/criminal-defence-barrister-blog/2019/self-driving-cars-and-the-law-more-problems-than-answers-barristers-perspective/
display(HTML("<center><table><tr><td><img src='./Images/alphago.jpg' width='900'></td><td><img src='./Images/Driverless-cars-graphic-wr.jpg' width='850'></td></tr></table></center>"))
The hype hasn’t diminished2.
...will make possible a new generation of artificial intelligence [AI] systems that will perform some functions that humans do with ease: see, speak, listen, navigate, manipulate and control.
December 28, 2013, The New York Times
Notes
Advances in AI are great and are spurring a lot of economic activity. However, there are currently unreasonable expectations, which drive2:
"Policy makers [are] earnestly having meetings to discuss the rights of robots when they should be talking about discrimination in algorithmic decision making."
Zachary Lipton, AI researcher at Carnegie Mellon University
Notes
"[AI poses a] fundamental risk to the existence of human civilization." Elon Musk, 2017
Compared to the human brain, machine learning isn’t especially efficient.
A machine learning program requires millions or billions of data points to create its statistical models.
It's only now that petabytes of data are readily available, along with powerful computers to process them13.
Notes
Extra: Facebook Inventing Skynet?
"AI Is Inventing Languages Humans Can’t Understand. Should We Stop It?" Fast Company article
BOB THE BOT: "I can can I I everything else."
ALICE THE BOT: "Balls have zero to me to me to me to me to me to me to me to me to."
BOB: "You I everything else."
ALICE: "Balls have a ball to me to me to me to me to me to me to me to me."
The original Facebook blog post simply described chatbots drifting into repeating nonsensical sentences, which was dramatically distorted into a story about saving the human race.
"There was no panic," one researcher said, "and the project hasn’t been shut down."
Notes
For many jobs, machine learning proves to be more flexible and nuanced than the traditional programs governed by rules13.
Rosenblatt deserves credit because many of his ambitious predictions have come true:
...are all built using perceptron-like algorithms2.
Most of the recent breakthroughs in machine learning are due to the masses of data available and the processing power to deal with it, rather than a fundamentally different approach.
Notes
We live in an ever increasing quantified world, where everything is counted, measured, and analyzed2:
We've also moved from companies paying customers to complete surveys to companies simply recording what we do2.
from IPython.display import HTML
# https://fitday.com/fitness-articles/fitness/5-tricks-to-boost-your-daily-step-count.html
# https://thecabinetmarket.com/design-advice/are-smart-appliances-the-right-choice-for-your-kitchen/
display(HTML("<center><table><tr><td><img src='./Images/steps.jpg'></td><td><img src='./Images/iStock-1132781699.jpg' width='522'></td></tr></table></center>"))
Notes
"Data collection is a big business. Data is valuable: “the new oil,” as the Economist proclaimed. We’ve known that for some time. But the public provides the data under the assumption that we, the public, benefit from it. We also assume that data is collected and stored responsibly, and those who supply the data won’t be harmed."14
What do they know2?
Mathematicians and statisticians use this data to study our desires, movements, and spending power.
This is the "Big Data economy", and it promises spectacular gains.
Algorithms not only save time and money but are "fair" and "objective".
Numbers and data suggest precision and imply a scientific approach, appearing to have an existence separate from the humans reporting them.
display(HTML("<center><img src='./Images/Calvin_Hobbes_Data_Quality.gif' width='800'></center>"))
Notes
Models don't involve prejudiced humans, just machines processing cold numbers, right?13
Numbers feel objective, but are easily manipulated.
"It’s like the old joke:2
A mathematician, an engineer, and an accountant are applying for a job. They are led to the interview room and given a math quiz. The first problem is a warm-up: What is 2 + 2? The mathematician rolls her eyes, writes the numeral 4, and moves on. The engineer pauses for a moment, then writes “Approximately 4.” The accountant looks around nervously, then gets out of his chair and walks over to the fellow administering the test. “Before I put anything in writing,” he says in a low whisper, “what do you want it to be?”"
Algorithms can go wrong and be damaging just due to simple human incompetence or malfeasance.
But also the fault could be:
In these (worryingly common) cases, it does not matter how expertly and carefully the algorithms are implemented.
Notes
"No algorithm, no matter how logically sound, can overcome flawed training data."2
Good training data is difficult and expensive to obtain.
Training data often comes from the real world, but the real world is full of human errors and biases.
Notes
As exact counts and exhaustive measurements are nearly always impossible, we take small samples of a larger group and use that information to make broader inferences.
Example2
"If one measured only a half dozen men and took their average height, it would be easy to get a misleading estimate simply by chance. Perhaps you sampled a few unusually tall guys. Fortunately, with large samples things tend to average out, and sampling error will have a minimal effect on the outcome."
This is more of a systematic error caused by our measurement method.
Example2
"Researchers might ask subjects to report their own heights, but men commonly exaggerate their heights—and shorter men exaggerate more than taller men."
The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.
Donald Campbell
The algorithms that take these measures as input can, in turn, modify behaviour.
Example13
Standardized testing can be valuable indicators of general school achievement under normal teaching conditions.
But when test scores become the goal of teaching, they both lose their value as indicators and distort the educational process.
Notes
Selection bias arises when sampled individuals differ systematically from the population eligible for your study.
Example2
"Suppose you decide to estimate people’s heights by going to the local basketball court and measuring the players. Basketball players are probably taller than average, so your sample will not be representative of the population as a whole, and as a result your estimate of average height will be too high."
What you see depends on where you look.
Example2
"People turn to Google when looking for help, and turn to Facebook to boast."
display(HTML("<center><table><tr><td><img src='./Images/husband_fb.png'></td><td><img src='./Images/husband_gl.png'></td></tr></table></center>"))
Outliers can significantly skew data.
In some data they are naturally part of what you are measuring, but they need to be interpreted appropriately and accounted for.
Example3
When analyzing income in the United States, there are a few extremely wealthy individuals whose income can influence the average income. For this reason, a median value is often a more accurate representation of the larger population.
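A toy illustration with made-up incomes:
import numpy as np
# nine modest incomes plus one extremely wealthy outlier (made-up numbers)
incomes = np.array([25_000, 30_000, 32_000, 35_000, 40_000,
                    42_000, 45_000, 50_000, 55_000, 10_000_000])
print(np.mean(incomes))    # ~1,035,000: dragged upwards by the single outlier
print(np.median(incomes))  # 41,000: much closer to a "typical" income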
When mistakes appear in data even the best-designed algorithms will make the wrong decision.
Statisticians count on large numbers to balance out exceptions and anomalies in data, but that means they punish individuals who happen to be the exception13:
Notes
There are tonnes of human biases that have been defined and classified by psychologists, each affecting individual decision making.
These include feelings towards a person based on their perceived group membership.
These biases could seep into machine learning algorithms via either4:
Studies23,24 have demonstrated that word embeddings (e.g. word2vec) reflect, and perhaps amplify, the biases already present in training documents.
This was also discovered by Amazon when building a machine learning model to evaluate the resumes of candidates for software engineering jobs25.
It's not suggested these are the result of bias at Amazon or Google; rather, this bias was the unexpected outcome of the careful application of rigorous and principled machine learning methodology to massive, complex, and biased datasets15.
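If you want to probe these associations yourself, a rough sketch using gensim (this assumes gensim is installed and downloads a small pretrained GloVe model on first use; the exact neighbours returned depend on the model):
import gensim.downloader as api
# small pretrained GloVe word vectors (roughly 65 MB download on first use)
vectors = api.load("glove-wiki-gigaword-50")
# analogy-style query: which words relate to "woman" as "programmer" relates to "man"?
print(vectors.most_similar(positive=["woman", "programmer"], negative=["man"], topn=5))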
Notes
"Machines are not free of human biases; they perpetuate them, depending on the data they’re fed."2
Despite appearing impartial, models reflect goals and ideology13.
Our values and desires influence our choices, from the data we choose to collect to the questions we ask13.
Notes
As good training data is hard to come by, it is often the case that we lack data on the behaviors we are most interested in classifying/predicting. Therefore proxies are used instead.
However, proxies are easier to manipulate than the complicated reality they represent13.
Example
We may want to develop a model that can predict whether someone will pay back a loan or handle a job.
As this is a prediction about something that may happen in the future we don't know the outcome yet, so we may be tempted to include factors such as a person’s postcode or language patterns.
Even if we do not use "race" as a variable in our models, as society is largely segregated by geography, a postcode is a highly effective proxy for race13.
Notes
Criminal Sentencing: Algorithms identify black defendants as "future" criminals at nearly twice the rate as white defendants, which leads to differences in pretrial release, sentencing, and parole deals2.
Deployment of Police Officers: Predictive programs, like PredPol and HunchLab, that position cops where crimes are most likely to appear, create a pernicious feedback loop. The policing spawns new data, which justifies more policing and prisons fill up with people from impoverished neighborhoods13.
Interest Rates: Algorithmic lenders charge higher interest rates to both black and Latino applicants2
Hiring Software: As discussed, automated hiring software has preferentially selected men over women2
College (University) Admissions: Predictive analytics packages (e.g. ForecastPlus, RightStudent) gather and sell data to help colleges target the most promising candidates for recruitment, including students who can pay full tuition, are eligible for scholarships, or have learning disabilities, etc.2,13
Notes
Criminal Sentencing
"racism is the most slovenly of predictive models. It is powered by haphazard data gathering and spurious correlations, reinforced by institutional inequities, and polluted by confirmation bias."13
"The question, however, is whether we’ve eliminated human bias or simply camouflaged it with technology. The new recidivism models are complicated and mathematical. But embedded within these models are a host of assumptions, some of them prejudicial."13
"This is the basis of our legal system. We are judged by what we do, not by who we are. And although we don’t know the exact weights that are attached to these parts of the test, any weight above zero is unreasonable."13
"sentencing models that profile a person by his or her circumstances help to create the environment that justifies their assumptions."13
"The penal system is teeming with data, especially since convicts enjoy even fewer privacy rights than the rest of us. What’s more, the system is so miserable, overcrowded, inefficient, expensive, and inhumane that it’s crying out for improvements. Who wouldn’t want a cheap solution like this?"13
Deployment of Police Officers
If the algorithms were trained on white-collar crimes, they would focus on very different areas of their community.
"police make choices about where they direct their attention. Today they focus almost exclusively on the poor... And now data scientists are stitching this status quo of the social order into models...we criminalize poverty, believing all the while that our tools are not only scientific but fair."13
"maybe we wish to predict crime risk, but we don’t have data on who commits crimes—we only have data on who was arrested. If police officers already exhibit racial bias in their arrest patterns, this will be reflected in the data."15
"Sometimes decisions made using biased data or algorithms are the basis for further data collection, forming a pernicious feedback loop that can amplify discrimination over time. An example of this phenomenon comes from the domain of “predictive policing,” in which large metropolitan police departments use statistical models to forecast neighborhoods with higher crime rates, and then send larger forces of police officers there."15
Interest Rates
Hiring Software
College (University) Admissions
WMDs ("Weapons of Math Destruction"), as defined by Cathy O'Neil, have three elements: Opacity, Scale, and Damage.
Opacity
"WMDs are, by design, inscrutable black boxes. That makes it extra hard to definitively answer the second question: Does the model work against the subject’s interest? In short, is it unfair? Does it damage or destroy lives?"13
Scale
"A formula...might be perfectly innocuous in theory. But if it grows to become a national or global standard, it creates its own distorted and dystopian economy."13
Damage
"They define their own reality and use it to justify their results. This type of model is self-perpetuating, highly destructive—and very common."13
Notes
Opacity
Scale
Damage
"To disarm WMDs, we...need to measure their impact and conduct algorithmic audits. The first step, before digging into the software code, is to carry out research. We’d begin by treating the WMD as a black box that takes in data and spits out conclusions."
Cathy O'Neil
Most often, problems arise either because there are biases in the data, or because there are obvious problems with the output or its interpretation.
Only occasionally do the technical details of the black box matter for spotting issues.
Notes
"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' … I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."
Charles Babbage, "father of the computer"
As data is so central to these systems, to spot problems we can start by looking at the training data and the labels.
GIGO: garbage in, garbage out.
Check: Is the data unbiased, reasonable, and relevant to the problem at hand?2
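A few lines of pandas already catch a lot at this stage (a sketch assuming a DataFrame df with a 0/1 label column and a demographic group column; both column names are placeholders):
import pandas as pd
def quick_audit(df, label_col="label", group_col="group"):
    # missing values per column
    print(df.isna().sum())
    # class balance: is one outcome vastly over-represented?
    print(df[label_col].value_counts(normalize=True))
    # does the label distribution differ sharply across groups?
    print(df.groupby(group_col)[label_col].mean())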
Notes
"extraordinary claims require extraordinary evidence."2
Logical Checks2
Show how someone's assumptions can lead to ridiculous conclusions.
Example: Momentous sprint at the 2156 Olympics?5
Sir—A. J. Tatem and colleagues calculate that women may outsprint men by the middle of the twenty-second century (Nature 431, 525; 2004; doi:10.1038/431525a). They omit to mention, however, that (according to their analysis) a far more interesting race should occur in about 2636, when times of less than zero seconds will be recorded. In the intervening 600 years, the authors may wish to address the obvious challenges raised for both time-keeping and the teaching of basic statistics.
Ken Rice, Biostatistics Professor
Notes
"If someone claims that A implies B, find a case in which A is true but B is not."2
Math is generally pretty good to use for counterexamples, although logical thinking also works well.
Example2
Fermat’s Last Theorem (long just a conjecture, since Fermat left no proof) states that there are no three positive integers $a$, $b$, and $c$ such that $a^n + b^n = c^n$ for any integer value of $n$ greater than 2. Mathematicians attempted to prove it for centuries, until Andrew Wiles finally succeeded in the 1990s.
It was later generalized by eighteenth-century mathematician Leonhard Euler into the sum of powers conjecture: for integers $a, b, c, \ldots, z$ and any integer $n$, if you want numbers $a^n, b^n, c^n, \ldots$, to add to some other number $z^n$, you need at least $n$ terms in the sum. Again time passed with no way of proving or disproving this, until 1966 when two mathematicians used an early computer to run through a huge list of possibilities and found the counterexample below:
$$27^5 + 84^5 + 110^5 + 133^5 = 144^5$$
27**5 + 84**5 + 110**5 + 133**5 == 144**5
"The point of a null model is not to accurately model the world, but rather to show that a pattern X, which has been interpreted as evidence of a process Y, could actually have arisen without Y occurring at all".2
Example6
The following is a plot intended to demonstrate how our physical and cognitive abilities decline as we age.
Notes
"We might see the same decreasing trend in speed simply as a consequence of sample size, even if runners did not get slower with age."2
More people run competitively in their twenties and thirties than in their seventies and eighties. The more runners you sample from, the faster you expect the fastest time to be.
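A null-model sketch: give every runner the same time distribution regardless of age and only vary the group sizes; the fastest time still "slows down" for the smaller (older) groups:
import numpy as np
rng = np.random.default_rng(1)
# made-up group sizes: many competitive runners in their 20s, few in their 80s
group_sizes = {"20s": 5000, "40s": 1500, "60s": 300, "80s": 30}
for age, n in group_sizes.items():
    times = rng.normal(240, 30, n)     # identical distribution of finishing times (minutes)
    print(age, round(times.min(), 1))  # smaller groups -> slower "record" times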
Notes2
This does not mean that senescence is a myth; it just means this is not compelling evidence, because the null model shows the same result without senescence2.
Other valid objections include:
Let's put these ideas into practice on an ML paper.
Automated Inference on Criminality Using Face Images8
Unlike a human examiner/judge, a computer vision algorithm or classifier has absolutely no subjective baggages [sic], having no emotions, no biases whatsoever due to past experience, race, religion, political doctrine, gender, age, etc., no mental fatigue, no preconditioning of a bad sleep or meal. The automated inference on criminality eliminates the variable of meta-accuracy (the competence of the human judge/examiner) all together.8
The problem with this new study can be identified in the training data and can be reasoned using a null model.
Notes
Training Data: The criminal faces used to train the algorithm were seldom smiling, whereas the noncriminal faces were usually smiling.
Null Model: Could we get the same result by training a model that only identifies smiling? Most likely.
Notes
A number of ML algorithms create their own rules to make decisions—and these rules often make little sense to humans2.
Sometimes these rules can be fooled surprisingly easily (adversarial attacks)28,29.
Notes
Sometimes the rules ML models use focus on unintended aspects of the training data.
Example10
Ribeiro et al. (2016) developed an automated method for distinguishing photographs of huskies from wolves.
By looking at the errors (e.g. where a husky is classified as a wolf), they demonstrated the importance of looking at what information the algorithm was using.
display(HTML("<table><tr><td><img src='./Images/husky_wolf.jpg' width='522'></td><td><img src='./Images/explain.jpg' width='522'></td></tr></table>"))
Extra Example
"John Zech and colleagues at California Pacific Medical Center wanted to investigate how well neural networks could detect pathologies such as pneumonia and cardiomegaly—enlarged heart—using X-ray images. The team found that their algorithms performed relatively well in hospitals where they were trained, but poorly elsewhere....It turned out that the machine was cueing on parts of the images that had nothing to do with the heart or lungs. For example, X-ray images produced by a portable imaging device had the word PORTABLE printed in the upper right corner—and the algorithm learned that this is a good indicator that a patient has pneumonia. Why? Because portable X-ray machines are used for the most severely ill patients, who cannot easily go to the radiology department of the hospital. Using this cue improved prediction in the original hospital. But it was of little practical value. It had little to do with identifying pneumonia, didn’t cue in on anything doctors didn’t already know, and wouldn’t work in a different hospital that used a different type of portable X-ray machine."2
Notes
Notes
Complicated models do a great job of fitting the training data, but simpler models often perform better on the test data.
The hard part is figuring out just how simple a model to use.2
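A minimal scikit-learn sketch of that trade-off on synthetic data (the degree-15 model is deliberately over-complex):
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (60, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, 60)   # noisy synthetic target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    # the most complex model fits the training data best but tends to do worst on the test data
    print(degree, model.score(X_train, y_train), model.score(X_test, y_test))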
Example: Detecting Influenza Epidemics Using Search Engine Query Data11
A method for predicting flu outbreaks based on Google search queries.
Notes
It worked well for a few years but the results started to miss the mark by a factor of two and it was eventually axed.
"There was no theory about what search terms constituted relevant predictors of flu, and that left the algorithm highly susceptible to chance correlations in timing."2
"When the frequency of search queries changes, the rules that the algorithm learned previously may no longer be effective."2
"A...likely culprit is changes made by Google’s search algorithm itself. The Google search algorithm is not a static entity—the company is constantly testing and improving search"12
Many complicated algorithms use hundreds of variables when making predictions.
If you add enough variables into your black box, you will eventually find combinations that perform well — but it may do so by chance.
As you increase the number of variables you use to make your predictions, you need exponentially more data to distinguish true predictive capacity from luck.
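A quick simulation of that effect: generate thousands of random "predictor" variables that have nothing to do with the target, and some will still look impressively predictive by chance alone.
import numpy as np
rng = np.random.default_rng(2)
n_samples, n_features = 50, 2000
X = rng.normal(size=(n_samples, n_features))   # random, meaningless "predictors"
y = rng.normal(size=n_samples)                 # target unrelated to all of them
# correlation of each feature with the target
corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_features)])
print(np.abs(corrs).max())  # typically above 0.4: looks "predictive", but is pure luck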
Notes
Models may work well on a small scale, but lead to unintended consequences when deployed at a larger scale.
Example: Navigation Apps15
Navigation apps can cause increased congestion and delays on side streets30.
Game theorists may call this a "bad equilibrium".
It creates its own strange incentives: people deliberately send misinformation about traffic jams and accidents so that the app will steer commuters elsewhere.
Example: Recommender Systems15
When our collective data is used to estimate a small number of user types, and you are assigned to one of them, the suggestions you see are narrowed by the scope of the model created, which is a function of everyone else’s activity.
People still have free will and can choose to ignore recommendations but the more of us that adopt their suggestions, the more our collective behavior is influenced or even determined by them.
Amazon Product Recommendations
Facebook Newsfeed
# https://chatbotsmagazine.com/are-chatbots-the-future-of-product-recommendation-eeeb4cfa3138
# https://www.bbc.co.uk/news/world-australia-56165015
display(HTML("<table><tr><td><img src='./Images/amazon.png'></td><td><img src='./Images/facebookNews.png'></td></tr></table>"))
If you torture the data for long enough, it will confess to anything.
Ronald Coase (1960s)
In a lot of sciences (including ML research) there is a "reproducibility crisis"32.
False discovery in the sciences comes from the scale (quantity) of research conducted over and over again on the same datasets, and from the selective sharing of results31.
Harms
Solutions
Extra Example: Overfitting to Validation/Test set15
Kaggle competitions make machine learning a competitive sport.
In the 2015 ImageNet competition, Baidu, a Chinese search engine company, announced it had made better progress than its competitor Google. But it turned out they had cheated.
Competitors were allowed to check their models on a validation set twice per week, to limit the leaking of information that leads to overfitting whilst still letting competitors know how well they were doing.
Baidu had created thirty fake accounts and tested their model on the validation set more than 200 times. They were therefore gradually fitting the validation set better and better, which improved their rank on the leaderboard.
- It's impossible to tell if their model was actually getting better in general or just better on the validation set.
They had to withdraw for a year and the team leader was fired.
Notes
Notes
"Moving fast and breaking things is unacceptable if we don’t think about the things we are likely to break."14
There are a number of ethical guidelines you can follow from the...
Notes
Example Checklist14
❏ Have we listed how this technology can be attacked or abused?
❏ Have we tested our training data to ensure it is fair and representative?
❏ Have we studied and understood possible sources of bias in our data?
❏ Does our team reflect diversity of opinions, backgrounds, and kinds of thought?
❏ What kind of user consent do we need to collect to use the data?
❏ Do we have a mechanism for gathering consent from users?
❏ Have we explained clearly what users are consenting to?
❏ Do we have a mechanism for redress if people are harmed by the results?
❏ Can we shut down this software in production if it is behaving badly?
❏ Have we tested for fairness with respect to different user groups?
❏ Have we tested for disparate error rates among different user groups?
❏ Do we test and monitor for model drift to ensure our software remains fair over time?
❏ Do we have a plan to protect and secure user data?
There are a number of ways we can try to mitigate the issues of ML models and pipelines in practice. These include:
"A more diverse AI community will be better equipped to anticipate, spot, and review issues of unfair bias and better able to engage communities likely affected by bias."16
A diverse AI community aids in the identification of bias.
The AI field currently does not encompass society’s diversity, including on gender, race, geography, class, and physical disabilities.
Notes
"anonymized data isn't"
Cynthia Dwork
There are many well known examples where "anonymized" datasets have been "de-anonymized" or led to unintended consequences14,15,22.
Notes
Redact information so that no set of characteristics matches just a single data record.
There are two main ways to make it k-anonymous15:
Limitations
Notes
"Tore Dalenius defined in 1977 as a goal for statistical database privacy: that nothing about an individual should be learnable from a dataset if it cannot also be learned without access to the dataset...we ask for a refinement of Dalenius’s goal: that nothing about an individual should be learnable from a dataset that cannot be learned from the same dataset but with that individual’s data removed."15
Uses randomness to deliberately add noise to computations so that any one person’s data cannot be reverse-engineered from the results. Since we know the process by which errors have been introduced, we can work backward to deduce approximately the fraction of the population for whom the truthful answer is yes.
centralized privacy: The privacy is added on the “server” side.
local privacy: The privacy is added on the “client” side.
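A minimal sketch of the local flavour, using the classic randomized-response (coin-flip) mechanism; the true rate of 30% below is made up:
import numpy as np
rng = np.random.default_rng(3)
true_answers = rng.random(100_000) < 0.30                  # 30% of people truthfully answer "yes"
# each person flips a coin: heads -> answer truthfully, tails -> answer uniformly at random
flip = rng.random(true_answers.size) < 0.5
reported = np.where(flip, true_answers, rng.random(true_answers.size) < 0.5)
# work backwards: P(reported yes) = 0.5 * true_rate + 0.5 * 0.5
estimated_rate = 2 * reported.mean() - 0.5
print(round(estimated_rate, 3))  # close to 0.30, yet no individual's answer can be pinned down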
Companies using Differential privacy
Limitations
Notes
"treat others' data as you would have others treat your own data."14
Notes
"implementing a golden rule in the actual research and development process is challenging..."14
"Most Twitter users know that their public tweets are, in fact, public; but many don’t understand that their tweets can be collected and used for research, or even that they are for sale."14
We can use GANs to generate realistic synthetic medical records while protecting the identities of the real patients whose data was used to train those GANs15
Are we willing to sacrifice a bit of efficiency and accuracy in the interest of fairness? Should we handicap the models?
Example: Credit Scoring13
FICO scores
e-scores
Access data on web browsing, purchasing patterns, the location of the visitor's computer, and real estate data for insights about the potential customer's wealth.
Bad Qualities
Notes
"Programmers don’t know how to code for [fairness], and few of their bosses ask them to."13
We can try to "...encode ethical principles directly into the design of the algorithms."15
Typically the focus is on algorithmic trade-offs between performance metrics such as computational speed, memory requirements, and accuracy, but there is emerging research into how "fairness" can be included as a metric when comparing algorithms.
Notes
Definitions of fairness, privacy, transparency, interpretability, and morality remain in the human domain, and require a multidisciplinary team to collaborate on turning them into quantitative definitions.
These new goals can be used as constraints on learning. They have associated costs: "If the most accurate model for predicting loan repayment is racially biased, then, by definition, eradicating that bias results in a less accurate model."15
"It’s easy to say that applications shouldn’t collect data about race, gender, disabilities, or other protected classes. But if you don’t gather that data, you will have trouble testing whether your applications are fair to minorities. Machine learning has proven to be very good at figuring its own proxies for race and other classes. Your application wouldn’t be the first system that was unfair despite the best intentions of its developers. Do you keep the data you need to test for fairness in a separate database, with separate access controls?"14
"No matter what things we prefer (or demand) that algorithms ignore in making their decisions, there will always be ways of skirting those preferences by finding and using proxies for the forbidden information."15
We can instead attempt to define fairness relative to our model predictions.
Machine learning models only optimize what we ask them to optimize; if we don't ask for fairness, we won't get fairness.
There are many ways we can define fairness, with these methods often conflicting.
Notes
A simple notion of fairness asks that the fraction of "Square" applicants that are granted loans be approximately the same as the fraction of "Circle" applicants that are granted loans.
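A sketch of how you might measure it, given arrays of loan decisions and group labels (all names and numbers below are toy placeholders):
import numpy as np
def statistical_parity_gap(granted, group):
    # fraction of applicants granted a loan, per group
    rates = {g: granted[group == g].mean() for g in np.unique(group)}
    print(rates)
    return max(rates.values()) - min(rates.values())
granted = np.array([1, 1, 1, 0, 1, 0, 0, 1, 0, 0], dtype=bool)
group = np.array(["square"] * 5 + ["circle"] * 5)
print(statistical_parity_gap(granted, group))  # 0.6 gap: far from parity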
Advantages
Limitations
Notes
Rather than evenly distributing the loans we give, we could require that we evenly distribute the mistakes we make (e.g. false rejections).
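Measured in code, the comparison shifts from acceptance rates to error rates; a toy sketch (it assumes we somehow know who would have repaid, which in practice we only observe for granted loans):
import numpy as np
def false_rejection_rates(granted, repaid, group):
    # false rejection: the applicant would have repaid, but the loan was denied
    rates = {}
    for g in np.unique(group):
        creditworthy = (group == g) & repaid
        rates[g] = (~granted[creditworthy]).mean()
    return rates
granted = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=bool)
repaid = np.array([1, 1, 1, 0, 1, 1, 1, 0], dtype=bool)
group = np.array(["square"] * 4 + ["circle"] * 4)
print(false_rejection_rates(granted, repaid, group))  # unequal rates: unfair under this notion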
Advantages
Limitations
Notes
"constitutes the set of “reasonable” choices for the trade-off between accuracy and fairness."15
"The Pareto frontier of accuracy and fairness is necessarily silent about which point we should choose along the frontier, because that is a matter of judgment about the relative importance of accuracy and fairness. The Pareto frontier makes our problem as quantitative as possible, but no more so."15
"So in choosing a point on the Pareto frontier for a lending algorithm, we might prefer to err strongly on the side of fairness—for example, insisting that the false rejection rate across different racial groups be very nearly equal, even at the cost of reducing bank profits. We’ll make more mistakes this way—both false rejections of creditworthy applicants and loans granted to parties who will default—but those mistakes will not be disproportionately concentrated in any one racial group."15
"Mathematical models should be our tools, not our masters."13
We need human values integrated into these systems, even at the cost of efficiency.
Big Data processes codify the past, they do not invent the future.
"decisions should be informed by many factors that cannot be made quantitative, including what the societal goal of protecting a particular group is and what is at stake."15
There are no universal answers15:
Today companies like Google, which have grown up in an era of massively abundant data, don't have to settle for wrong models. Indeed, they don't have to settle for models at all...Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves. Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.
Chris Anderson, Wired 200817
"Big data" and "machine learning", in most instances, should compliment rather than supplement work by humans.
For open-ended tasks involving judgment and discretion, there is still no substitute for human intervention.
Are there some things we want to formalise and remove the human element?
Notes
"Identifying fake news, detecting sarcasm, creating humor—for now, these are areas in which machines fall short of their human creators. However, reading addresses is relatively simple for a computer. The digit classification problem—figuring out whether a printed digit is a one, a two, a three, etc.—is a classic application of machine learning."2
Humans can be seen as throwbacks in the data economy - inefficient and costly.
Any statistical program has errors, but why not just get humans to work on fine-tuning the algorithms?
Automatic systems urgently require the context, common sense, and fairness that only humans can provide, especially when faced with error-ridden data13.
"We particularly need to think about the unintended consequences of our use of data."14
Machines lack moral imagination; that’s something only humans can provide13.
Human decision making, while often flawed, can evolve. Automated systems stay stuck in time until engineers dive in to change them13.
Trustworthy models maintain a constant back-and-forth with what they are trying to understand or predict. As conditions change, so must the model13.
Operational strategies for businesses can include16:
Notes
Algorithmic transparency is the principle that people affected by decision-making algorithms should have a right to know why the algorithms are making the choices that they do.2
Transparency about processes and metrics helps us understand the steps taken to promote fairness and associated trade-offs16.
Auditors face resistance from the web giants
Researchers however are moving forward with auditing, such as the "Web Transparency and Accountability Project".
Notes
Most machine learning algorithms are (broadly) simple and principled, but the models output by them can be extremely difficult to understand as they capture complex relationships between variables - simple algorithms applied to simple data can lead to complex models.
Interpretable to whom? Different groups of people may use this algorithm, so the definition of interpretable may change between them15.
What is interpretable?
Notes
Algorithmic accountability is the principle that firms or agencies using algorithms to make decisions are still responsible for those decisions, especially decisions that involve humans. We cannot let people excuse unjust or harmful actions by saying “It wasn’t our decision; it was the algorithm that did that.”2
If platform companies, app developers, and government agencies don't care about privacy or fairness, there can be an incentive to ignore transparency unless there is accountability.
The European Union’s General Data Protection Regulation
The tech industry itself is starting to develop self-regulatory initiatives of various types, such as the Partnership on AI to Benefit People and Society15.
Notes
"We need to rethink the entire stack — from software to hardware, Deep learning has made the recent AI revolution possible, but its growing cost in energy and carbon emissions is untenable."
Aude Oliva, director of the MIT-IBM Watson AI Lab
Current ML practice is rapidly becoming economically, technically, and environmentally unsustainable26.
"The challenge for data scientists is to understand the ecosystems they are wading into and to present not just the problems but also their possible solutions."13
"Sometimes the job of a data scientist is to know when you don’t know enough. As I survey the data economy, I see loads of emerging mathematical models that might be used for good and an equal number that have the potential to be great—if they’re not abused. Consider the work of Mira Bernstein, a slavery sleuth. A Harvard PhD in math, she created a model to scan vast industrial supply chains, like the ones that put together cell phones, sneakers, or SUVs, to find signs of forced labor. She built her slavery model for a nonprofit company called Made in a Free World. Its goal is to use the model to help companies root out the slave-built components in their products...Like many responsible models, the slavery detector does not overreach. It merely points to suspicious places and leaves the last part of the hunt to human beings."13
"Another model for the common good has emerged in the field of social work. It’s a predictive model that pinpoints households where children are most likely to suffer abuse. The model, developed by Eckerd, a child and family services nonprofit in the southeastern United States, launched in 2013 in Florida’s Hillsborough County, an area encompassing Tampa...It funnels resources to families at risk. "13
"Technologically, the same artificial intelligence techniques used to detect fake news can be used to get around detectors, leading to an arms race of production and detection that the detectors are unlikely to win."2
scikit-learn
you may want to look at some related python projects: https://scikit-learn.org/stable/related_projects.html#related-projects
Recommended Lectures
("Guest" Lectures in the age of COVID)
My Current Reading List
import sys
from shutil import copyfile
import os
# where the HTML template is located
dst = os.path.join(sys.prefix, 'lib', 'site-packages', 'nbconvert', 'templates', "classic.tplx")
# If it's not located where it should be
if not os.path.exists(dst):
# uses a nb_pdf_template
curr_path = os.path.join(os.getcwd(),"..", "Extra", "classic.tplx")
# copy where it is meant to be
copyfile(curr_path, dst)
# Create HTML notes document
!jupyter nbconvert 1_Big_Data_Black_Boxes_and_Bias.ipynb \
--to html \
--output-dir . \
--template classic
!jupyter nbconvert 1_Big_Data_Black_Boxes_and_Bias.ipynb \
--to slides \
--output-dir . \
--TemplateExporter.exclude_input=True \
--TemplateExporter.exclude_output_prompt=True \
--SlidesExporter.reveal_scroll=True
# Create pdf notes document (issues)
!jupyter nbconvert 1_Big_Data_Black_Boxes_and_Bias.ipynb \
--to html \
--output-dir ./PDF_Prep \
--output 3_Applications_no_code \
--TemplateExporter.exclude_input=True \
--TemplateExporter.exclude_output_prompt=True