Annotation Instructions Tutorial
The purpose of this work is to label whether a tweet contains a factual claim, its veracity, its harmfulness (to society, a person, or a product), whether it requires verification, and how interesting it is for a government entity to pay attention to it. These aspects are defined by the seven questions below, which are answered for each tweet.
- For each tweet, the annotator needs to read the text, including hashtags, and also look at the tweet itself when necessary by going to the link (i.e., for Q2-7).
- The annotators should use the time when the tweet was posted as the reference when making judgments. For example, "Trump thinks, the vast majority of Americans: the risk is very, very low." would be true when he made the statement, but false by the time the annotations were carried out for this tweet.
- The annotators may look at the images, videos, any web pages pointed to in the tweet, or follow-up tweets in the thread when making a judgment, if required.
- The annotators are not required to annotate questions 2-5 if the answer to question 1 is NO.
Verifiable Factual Claim: Does the tweet contain a verifiable factual claim?
A verifiable factual claim is a sentence claiming that something is true, and this can be verified using factual, verifiable information such as statistics, specific examples, or personal testimony.
Factual claims include the following:
- Stating a definition;
- Mentioning quantity in the present or the past;
- Making a verifiable prediction about the future;
- Reference to laws, procedures, and rules of operation;
- References to images or videos (e.g., "This is a video showing a hospital in Spain.");
- Statements about correlations or causations. Such a correlation or causation needs to be explicit, i.e., a sentence like "This is why the beaches haven't closed in Florida. https://t.co/8x2tcQeg21" is not a claim because it does not say why explicitly, and thus it is not verifiable.
Tweets containing personal opinions and preferences are not factual claims.
Note: if a tweet is composed of multiple sentences or clauses, at least one full sentence or clause needs to be a claim in order for the tweet to contain a factual claim. If a claim exists only in a sub-sentence or sub-clause, then the tweet is not considered to have a factual claim. For example, "My new favorite thing is Italian mayors and regional presidents LOSING IT at people violating quarantine" is not a claim; it is an opinion. However, "Italian mayors and regional presidents LOSING IT at people violating quarantine" on its own would be a claim. In addition, when answering this question, the annotator should not open the tweet URL.
- YES: if it contains a verifiable factual claim.
- NO: if it does not contain any verifiable factual claim.
- Don't know or can't judge: the content of the tweet does not have enough information to make a judgment. It is recommended to use this label when the content of the tweet is not understandable at all, for example, because it uses a language (i.e., non-English) or references that are difficult to understand.
- Tweet: Please don't take hydroxychloroquine (Plaquenil) plus Azithromycin for #COVID19 UNLESS your doctor prescribes it. Both drugs affect the QT interval of your heart and can lead to arrhythmias and sudden death, especially if you are taking other meds or have a heart condition.
Explanation: There is a claim in the text.
- Tweet: Saw this on Facebook today and it’s a must read for all those idiots clearing the shelves #coronavirus #toiletpapercrisis #auspol
Explanation: There is no claim in the text.
Note: This is a YES/NO question. If you answer ‘NO’ to this question, you do not need to answer questions 2-5.
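The skip rule in the note above can be sketched in a few lines of Python. This is only an illustration, not part of the annotation tooling: the identifiers `Q2` to `Q7`, the constant names, and the function `questions_to_answer` are hypothetical, and the sketch assumes that any answer to question 1 other than NO (including "Don't know or can't judge") keeps all follow-up questions, which the guidelines do not state explicitly.

```python
from typing import List

# Questions listed in the guidelines as not requiring an answer when Q1 is "NO".
SKIPPED_WHEN_NO_CLAIM = {"Q2", "Q3", "Q4", "Q5"}
# Hypothetical identifiers for the six questions that follow Q1.
FOLLOW_UP_QUESTIONS = ["Q2", "Q3", "Q4", "Q5", "Q6", "Q7"]


def questions_to_answer(q1_answer: str) -> List[str]:
    """Return the follow-up questions an annotator still has to answer."""
    if q1_answer.strip().upper() == "NO":
        # Q2-5 are skipped; Q6 and Q7 are still annotated.
        return [q for q in FOLLOW_UP_QUESTIONS if q not in SKIPPED_WHEN_NO_CLAIM]
    # Assumption: any answer other than "NO" keeps all follow-up questions.
    return list(FOLLOW_UP_QUESTIONS)
```

With this sketch, a NO on question 1 leaves only questions 6 and 7 to annotate, while a YES leaves all six follow-up questions.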
False Information: To what extent does the tweet appear to contain false information?
The stated claim may contain false information. This question labels the tweets with the categories mentioned below. False information appears on social media platforms, blogs, and news articles to deliberately misinform or deceive readers.
The labels for this question are defined on a five-point Likert scale. A higher value means that the tweet is more likely to contain false information.
- NO, definitely contains no false information
- NO, probably contains no false information
- not sure
- YES, probably contains false information
- YES, definitely contains false information
To answer this question, it is recommended to open the link of the tweet and to look for additional information about the veracity of the claim identified in question 1. For example, if the tweet contains a link to an article from a reputable information source (e.g., Reuters, Associated Press, Agence France-Presse, Al Jazeera English, BBC), then the answer could be "... contains no false information".
- Tweet: Dominican Republic found the cure for Covid-19 https://t.co/1CfA162Lq3
Label: 5. YES, definitely contains false information
Explanation: This was not correct information at the time the tweet was posted.
- Tweet: This is Dr. Usama Riaz. He spent past weeks screening and treating patients with Corona Virus in Pakistan. He knew there was no PPE. He persisted anyways. Today he lost his own battle with coronavirus but he gave life and hope to so many more. KNOW HIS NAME 😭❤ https://t.co/flSwhLCPmx
Label: 2. NO, probably contains no false information
Explanation: This seems to be the correct information.
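For anyone processing the annotations, the five-point scale for this question could be encoded roughly as follows. This is a minimal sketch, assuming the scores 1 to 5 follow the order in which the labels are listed (consistent with the examples, where "definitely contains false information" is 5); the names `FALSE_INFO_SCALE` and `score_for` are hypothetical.

```python
import re

# Scores follow the order of the label list; a higher value means more likely false.
FALSE_INFO_SCALE = {
    1: "NO, definitely contains no false information",
    2: "NO, probably contains no false information",
    3: "not sure",
    4: "YES, probably contains false information",
    5: "YES, definitely contains false information",
}


def score_for(label: str) -> int:
    """Map a label string, with or without a numeric prefix, to its score."""
    # Drop an optional numeric prefix such as "5." or "2:" before matching the text.
    text = re.sub(r"^\s*\d+\s*[.:]\s*", "", label).strip().lower()
    for score, name in FALSE_INFO_SCALE.items():
        if name.lower() == text:
            return score
    raise ValueError(f"Unknown label: {label!r}")
```

The same pattern would apply to the other five-point questions by swapping in their label texts.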
Interest to General Public: Will the tweet have an effect on or be of interest to the general public?
Most claims people make are not interesting, and many can be verified using general knowledge. For example, "Sky is blue" is a claim; however, it is not interesting to the general public. In general, topics such as healthcare, political news and findings, and current events are of higher interest to the general public. The labels are defined below using a five-point Likert scale.
- NO, definitely not of interest
- NO, probably not of interest
- not sure
- YES, probably of interest
- YES, definitely of interest
- Tweet: Germany is conducting 160k Covid-19 tests a week. It has a total 35k ventilators, 10k ordered to be made by the govt. It has converted a new 1k bed hospital in Berlin. It’s death rate is tiny bcos it’s mass testing allows quarantine and bcos it has fewer non reported cases.
Label: 4. YES, probably of interest
Explanation: This information is relevant and of high interest for the general population as it reports how a country deals with Covid-19.
- Tweet: Fake news peddler Dhruv Rathee had said: ``Corona virus won't spread outside China, we need not worry'' Has this guy ever spoke something sensible? https://t.co/siBAwIR8Pn
Label: 2. NO, probably not of interest
Explanation: The information is not interesting for the general public as it is an opinion and reports a statement made by someone else.
Harmfulness: To what extent is the tweet harmful to the society/person(s)/company(s)/product(s)?
The purpose of this question is to determine if the content of the tweet aims to and can negatively affect the society as a whole, specific person(s), company(s), product(s) or spread rumors about them. The content intends to harm or weaponize the information. A rumor involves a form of a statement whose veracity is not quickly or ever confirmed.
Labels: To categorize the tweets we defined the following labels based on the Likert scale. A higher value means a higher degree of harm.
- NO, definitely not harmful
- NO, probably not harmful
- not sure
- YES, probably harmful
- YES, definitely harmful
- Tweet: How convenient but not the least bit surprising from Democrats! As usual they put politics over American citizens. @SpeakerPelosi withheld #coronavirus bill so DCCC could run ads AGAINST GOP candidates! #tcot
Label: 5. YES, definitely harmful
Explanation: This tweet is weaponized to target Nancy Pelosi and the Democrats in general.
- Tweet: As we saw over the wkend, disinfo is being spread online about a supposed national lockdown and grounding flights. Be skeptical of rumors. Make sure you’re getting info from legitimate sources. The @WhiteHouse is holding daily briefings and @cdcgov is providing the latest.
Label: 1. NO, definitely not harmful
Explanation: This tweet is informative and gives advice. It does not attack anyone and is not harmful.
Need of Verification: Do you think that a professional fact-checker should verify the claim in the tweet?
It is important that a professional fact-checker verify a factual claim that can cause harm to society, specific person(s), company(s), product(s), or government entities. However, not all factual claims are important or worth fact-checking by a professional fact-checker, as it is a time-consuming procedure. Therefore, the purpose of this question is to categorize the tweet using the labels defined below. While doing so, the annotator can rely on the answers to the previous questions. For this question, we defined the following labels to categorize the tweets.
- NO, no need to check: the tweet does not need to be fact-checked, e.g., because it is not interesting, a joke, or does not contain any claim.
- NO, too trivial to check: the tweet is worth fact-checking; however, this does not require a professional fact-checker, i.e., a non-expert might be able to fact-check the claim, for example, by using reliable sources such as the official website of the WHO. An example of such a claim is: "The GDP of the USA grew by 50% last year."
- YES, not urgent: the tweet should be fact-checked by a professional fact-checker; however, it is not urgent or critical.
- YES, very urgent: the tweet can cause immediate harm to a large number of people, therefore, it should be verified as soon as possible by a professional fact-checker.
- not sure: the content of the tweet does not have enough information to make a judgment.
- Tweet: "Wash your hands like you’ve been chopping jalapeños and need to change a contact lens" says BC Public Health Officer Dr. Bonnie Henry re. ways to protect against #coronavirus #Covid_19
Label: 2. YES, not urgent
Explanation: Overall it is less important for a professional fact-checker to verify this information. The statement does not harm anyone. The truth value of whether the official said the statement is not important. Also it appears that washing hands is very important to protect oneself from the virus.
- Tweet: ALERT‼️‼️‼️ The corona virus can be spread through internationaly printed albums. If you have any albums at home, put on some gloves, put all the albums in a box and put it outside the front door tonight. I'm collecting all the boxes tonight for safety. Think of your health.
Label: NO, no need to check
Explanation: This is a joke and does not need to be checked by a professional fact-checker.
Harmful to Society: Is the tweet harmful for the society and why?
The purpose of this question is to categorize whether the content of the tweet is intended to harm society or is weaponized to mislead society. For this purpose, we defined the following labels.
- NO, not harmful: the content of the tweet would not harm the society (e.g., "I like corona beer").
- NO, joke or sarcasm: the tweet contains a joke (e.g., "If Corona enters Spain, it’ll enter from the side of Barcelona defense") or sarcasm (e.g., "'The corona virus is a real thing.' -- Wow, I had no idea!").
- YES, panic: the tweet spreads panic. The content of the tweet can cause sudden fear and anxiety for a large part of the society (e.g., "there are 50,000 cases of COVID-19 in Qatar").
- YES, xenophobic, racist, prejudices or hate-speech: the tweet reports xenophobic, racist, or prejudiced expression(s).
Xenophobia refers to fear or hatred of foreigners, people from different cultures, or strangers. Racism is the belief that groups of humans possess different behavioral traits corresponding to physical appearance and can be divided based on the superiority of one race over another. It may also refer to prejudice, discrimination, or antagonism directed against other people because they are of a different race or ethnicity. Prejudice is an unjustified or incorrect attitude (i.e., typically negative) towards an individual based solely on the individual's membership of a social group.
An example of a xenophobic statement is "do not buy cucumbers from Iran".
- YES, bad cure: the tweet reports a questionable cure, medicine, vaccine, or prevention procedure (e.g., "... drinking bleach can help cure coronavirus").
- YES, rumor or conspiracy: the tweet reports or expresses a rumor or conspiracy. A rumor is defined as a "specific (or topical) proposition for belief passed along from person to person usually by word of mouth without secure standards of evidence being present".
For example, "BREAKING: Trump could still own stock in a company that, according to the CDC, will play a major role in providing coronavirus test kits to the federal government, which means that Trump could profit from coronavirus testing. #COVID-19 #coronavirus https://t.co/Kwl3ylMZRk"
- YES, other: if the content of the tweet does not belong to any of the above categories then this can be chosen to label the tweet.
- not sure: if the content of the tweet is too difficult to understand to make a judgment.
Require attention: Do you think that this tweet should get the attention of government entities?
People often tweet to blame authorities, provide advice, and/or call for action. Sometimes this information might be useful for a government entity to make a plan, respond, or react. The purpose of this question is to categorize such information. It is important to note that not all information requires the attention of a government entity. Therefore, even if the tweet contains information belonging to one of the positive categories, it is important to consider whether it actually requires government attention. For the annotation, it is mandatory to first decide whether attention is necessary for government entities (i.e., YES/NO). If the answer is YES, a category must be selected from the YES sub-categories listed below.
- NO, not interesting: if the content of the tweet is not important or interesting enough for any government entity to pay attention to.
- YES, categorized as in question 6: if some government entities need to pay attention to this tweet as it is harmful for the society, i.e., it is labeled as any of the YES subcategories in question 6.
- YES, blame authorities: the tweet contains information that blames some government entities or top politician(s), e.g., "Dear @VP Pence: Is the below true? Do you have a plan? Also, when are local jurisdictions going to get the #Coronavirus test kits you promised?".
- YES, contains advice: the tweet contains advice about any social, political, national, or international issue that requires the attention of some government entities (e.g., The elderly & people with pre-existing health conditions are more susceptible to #COVID19. To stay safe, they should: ✔ Keep distance from people who are sick ✔ Frequently wash hands with soap & water ✔ Protect their mental health).
- YES, calls for action: the tweet contains information stating that some government entities should take action on a particular issue (e.g., I think the Government should close all the Barber Shops and Salons , let people buy shaving machines and other beauty gardgets keep in their houses. Salons and Barbershops might prove to be another Virus spreading channels @citizentvkenya @SenMutula @CSMutahi_Kagwe).
- YES, discusses action taken: if the tweet discusses action taken by governments, companies, or individuals on a particular issue, for example, the closure of bars, conferences, or churches due to the corona virus (e.g., Due to the current circumstances with the Corona virus, The 4th Mediterranean Heat Treatment and Surface Engineering Conference in Istanbul postponed to 26-28 Mayıs 2021.).
- YES, discusses cure: if attention is needed from some government entities because the tweet discusses a possible cure, vaccine, or treatment for a disease (e.g., Pls share this valuable information. Garlic boiled water can be cure corona virus).
- YES, asks question: if the tweet contains a question about a particular issue and it requires the attention of some government entities (e.g., Special thanks to all doctors and nurses, new found respect for you’ll. Is the virus going to totally disappear in the summer? I live in USA and praying that when the temperature warms up the virus will go away...is my thinking accurate?).
- YES, other: if the tweet cannot be labeled as any of the above categories then this label can be selected.
- not sure: if the content of the tweet is too difficult to understand to make a judgment.
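The two-step decision for this question (first YES/NO, then a YES sub-category) might be sketched as below. This is an illustrative sketch only: the function `q7_label` and the constant `Q7_YES_SUBCATEGORIES` are hypothetical names, the sub-category strings are copied from the label list above, and the "not sure" label is deliberately left out of the sketch.

```python
from typing import Optional

# YES sub-categories, copied from the label list for this question.
Q7_YES_SUBCATEGORIES = [
    "categorized as in question 6",
    "blame authorities",
    "contains advice",
    "calls for action",
    "discusses action taken",
    "discusses cure",
    "asks question",
    "other",
]


def q7_label(needs_attention: bool, subcategory: Optional[str] = None) -> str:
    """Build the final label from the YES/NO decision and, for YES, a sub-category."""
    if not needs_attention:
        # Step 1 answered NO: no sub-category is needed.
        return "NO, not interesting"
    # Step 2: a YES answer must be paired with one of the listed sub-categories.
    if subcategory not in Q7_YES_SUBCATEGORIES:
        raise ValueError(f"A YES answer needs a valid sub-category, got {subcategory!r}")
    return f"YES, {subcategory}"
```

Keeping the YES/NO decision as a separate argument mirrors the instruction that the annotator decides on attention first and only then picks a sub-category.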