The purpose of this analysis is to use data compiled by Omelas to measure Chinese Communist Party (CCP) influence over the digital environments of the 17+1 countries. Using exploratory data analysis, correlation analysis, and regression modeling, we gain insight into the effectiveness of Chinese influence operations in these countries.

From October 1, 2021, through April 1, 2022, Omelas analyzed over 820,000 social media posts, communication app messages, and online articles targeting countries currently or formerly part of the Cooperation between China and Central and Eastern European Countries Initiative (hereafter 17+1) to understand the extent and success of Chinese digital influence operations in each country. We tested each metric for correlation with polling data on each country’s favorability towards China, then combined the metrics with a statistically significant connection into a weighted score that summarizes the online influence of China and pro-China narratives in each country. Below is the initial ranking. Scores are on a base-10 logarithmic scale, so a score of 2 represents 10 times the influence of a score of 1, and a score of 3 represents 100 times.

Results

Country | Digital Influence
Lithuania | no influence
Estonia | minor influence
Poland | minor influence
Slovenia | minor influence
Latvia | minor influence
Croatia | minor influence
Czechia | minor influence
Romania | minor influence
Hungary | some influence
Greece | some influence
Albania | some influence
Montenegro | some influence
Slovakia | some influence
Bosnia and Herzegovina | some influence
North Macedonia | considerable influence
Bulgaria | considerable influence
Serbia | considerable influence
China | dominant influence

Results by Country

Lithuania

3.93 (no influence)

China has made next to no investment in digital influence operations in Lithuania. The Chinese embassy in Lithuania has a Facebook account that rarely posts and accrues few engagements. In domestic media, positive posts about China receive more engagement than negative posts, but by a smaller margin than elsewhere.

Estonia

5.18 (minor influence)

China’s only presence in Estonia is the embassy’s Facebook page, which posts sparingly and attracts few engagements. Domestic media is largely neutral towards China but leans negative.

Poland

5.18 (minor influence)

While China has invested heavily in Poland, establishing a CRI outlet and multiple Confucius Institutes, all with a multiplatform social media presence, the yield on that investment has been minimal. Polish media is neutral to negative on China, while the approval rating of China is the third lowest among the 17+1 countries.

Slovenia

5.20 (minor influence)

China has made a minimal investment in online propaganda in Slovenia, with only a single embassy account on Facebook that generates single-digit engagements. Domestic media is neutral, and negative stories on China perform well compared to positive ones.

Latvia

5.36 (minor influence)

China possesses both an embassy account and accounts for a Confucius Institute in Latvia, representing its largest digital presence in the Baltics. Nonetheless, these accounts perform poorly. Domestic media is neutral towards China, and negative stories perform well.

Croatia

5.37 (minor influence)

China has a dedicated CRI branch, embassy accounts, and Confucius Institute accounts targeting Croatia. These outlets generate a significant amount of content but form a small share of the overall Croatian media landscape and attract few engagements. Negative coverage of China performs well.

Czechia

5.49 (minor influence)

China has a dedicated CRI branch, embassy accounts, and Confucius Institute accounts targeting Czechia. Despite over a million followers for CRI Czech’s Facebook page, engagement rates are abysmal: posts rarely generate more than a handful of likes, a pattern consistent with China’s tactic of boosting Facebook follower counts through bots or click farms. Chinese investments in Czech media appear to have borne some fruit, as domestic coverage is more positive than in neighboring states, although positive stories tend to perform poorly.

Romania

5.49 (minor influence)

China has a dedicated CRI branch, embassy accounts, and Confucius Institute accounts targeting Romania, producing more content than in any other country in the sample besides Croatia. Despite the prolific production, engagements are minimal. Domestic media is more negative towards China than in any other country observed.

Hungary

5.78 (some influence)

China has a dedicated CRI branch, embassy accounts, and Confucius Institute accounts targeting Hungary. While these accounts produce a huge amount of content (3rd overall), they average fewer than one engagement per post. Domestic media is slightly negative towards China. Positive posts on China perform well and approval towards China, though low overall, is among the highest in the region.

Greece

5.79 (some influence)

China has a dedicated CRI branch, embassy accounts, and Confucius Institute accounts targeting Greece. These accounts receive significant engagements per post but post infrequently. Domestic media is highly negative towards China.

Albania

5.79 (some influence)

China has a dedicated CRI branch, embassy accounts, and Confucius Institute accounts targeting Albania. CCP accounts generate more engagements in Albania than in any other 17+1 country, and domestic media is slightly positive towards China. Nonetheless, positive stories do not perform especially well and approval ratings for China remain low.

Montenegro

5.80 (some influence)

China’s only online presence in Montenegro is the embassy’s Facebook page, which rarely posts and attracts few engagements. Domestic media is neutral towards China, but positive stories perform far better than negative stories, by a wider margin than in any other country in the sample besides North Macedonia.

Slovakia

5.81 (some influence)

China’s online presence in Slovakia is through the embassy’s Facebook page, which rarely posts but attracts significant engagements when it does, and the embassy’s Twitter account. CCP accounts perform better in Slovakia than in any other country besides Bulgaria. Domestic media is neutral towards China.

Bosnia and Herzegovina

5.91 (some influence)

China has made a minimal investment in online influence in Bosnia, with the embassy’s Facebook account the extent of its online presence. Domestic media is highly positive when reporting on China, though negative stories perform well.

North Macedonia

6.01 (considerable influence)

China has made a minimal investment in online influence in North Macedonia, with the embassy’s Facebook account the extent of its online presence. Despite the lack of direct influence, domestic media in North Macedonia is more positive towards China than in any other country in the sample. Positive stories perform better in North Macedonia than in any other country in the sample.

Bulgaria

6.37 (considerable influence)

China has a dedicated CRI branch, embassy accounts, and Confucius Institute accounts targeting Bulgaria. These accounts receive the highest engagements per post of any country in the sample. Domestic media is neutral towards China and positive stories do not perform exceptionally well. Nonetheless, approval of China is higher in Bulgaria than in any 17+1 country besides Serbia.

Serbia

6.49 (considerable influence)

China has dedicated minimal direct resources to online influence in Serbia, with only the embassy having an online presence. Despite this, domestic media is highly positive towards China and positive stories perform better than anywhere in the 17+1 countries besides Montenegro. China’s approval rating in Serbia is 20 points higher than in any other 17+1 country.

China

7.84 (dominant influence)

China is used to establish ground truth. If we’ve developed the index correctly, we should see Chinese influence within China as many times higher than in any 17+1 country. China’s score of 7.84 exceeds Serbia’s 6.49 by 1.35 points, which on the base-10 logarithmic scale corresponds to roughly 22 times the influence. This supports the validity of the index.
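The conversion from a score gap to an influence multiple can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the scores are base-10 logarithms, as the introduction describes; the function name `influence_ratio` is ours and does not come from the Omelas notebook.

```python
def influence_ratio(score_a: float, score_b: float) -> float:
    """How many times larger the influence behind score_a is than score_b.

    Assumes both scores are base-10 logarithms of the underlying index.
    """
    return 10 ** (score_a - score_b)


# Scores taken from the country results above.
china, serbia, lithuania = 7.84, 6.49, 3.93

print(round(influence_ratio(china, serbia), 1))      # China relative to Serbia
print(round(influence_ratio(serbia, lithuania)))     # Serbia relative to Lithuania
```

The same function applies to any pair of countries, which makes gaps that look small on the log scale easier to interpret.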

Methodology:

Data Collection

Omelas collected the accounts, channels, websites, and RSS feeds of the top 10 newspapers and top 10 TV channels (collectively “outlets”) in select countries, ranked by visitors to their websites, excluding any sports or entertainment outlets. We then partnered with local experts to supplement the initial list with additional important outlets. For all content, Omelas collected basic data on text, date posted, and location of post. Additionally, Omelas used our proprietary named entity recognition and mapping to find references to China and to people, locations, and organizations associated with China. We applied sentiment analysis at the sentence level, scoring each sentence from -1 (most negative) to +1 (most positive). We collected engagements (likes, shares, comments, and equivalents) on initial ingestion of data and then updated those numbers three times in the 72 hours after initial ingestion.

Feature Engineering

We start with a total of 10 variables that can be grouped into two categories:

  1. Digital influence variables: these come from proprietary Omelas data and serve as measures of CCP digital influence in individual countries. You can find a copy of the dataset here.
  2. On-the-ground variables: we use CCP polled approval data based on Gallup polls as a measure of citizens’ attitudes towards the Chinese government.

From the 9 digital influence variables we use feature engineering to construct two more variables:

  • The number of CCP engagements per CCP post (engagements/posts), which measures how many people, on average, interact with each CCP post.
  • The ratio of engagements on positive domestic coverage of China to engagements on negative domestic coverage (engagements on positive domestic posts/engagements on negative domestic posts), which indicates whether positive or negative coverage of China gains more traction.
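The two engineered variables are simple ratios of raw counts. The sketch below shows one way to compute them for a single country; the dictionary keys and the numbers are hypothetical, and returning NaN for a zero denominator is our assumption rather than something the source specifies.

```python
import math


def safe_ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def add_engineered_features(row: dict) -> dict:
    """Return a copy of `row` with the two ratio features added."""
    out = dict(row)
    out["ccp_engagements_per_post"] = safe_ratio(
        row["ccp_engagements"], row["ccp_posts"]
    )
    out["pos_neg_domestic_engagement_ratio"] = safe_ratio(
        row["eng_on_domestic_pos_posts"], row["eng_on_domestic_neg_posts"]
    )
    return out


# Made-up counts for one hypothetical country.
example = {
    "ccp_posts": 400,
    "ccp_engagements": 1200,
    "eng_on_domestic_pos_posts": 500,
    "eng_on_domestic_neg_posts": 2000,
}
print(add_engineered_features(example))
```

A ratio below 1 for the second feature means negative coverage attracts more engagement than positive coverage.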

After adding the two variables mentioned above we have a total of 12 variables as shown in the data dictionary below.

Data dictionary:

Feature | Description
CCP Posts | Count of posts from the CCP
CCP Post Share | CCP posts as a share of all posts
CCP Engagements | Sum of engagements on CCP posts
CCP Engagement Share | CCP engagements as a share of all engagements
Domestic Posts about China | Posts from domestic media that reference China
Domestic Positive Posts about China | Posts from domestic media that reference China with a calculated sentiment of 0.1 or above
Domestic Negative Posts about China | Posts from domestic media that reference China with a calculated sentiment of -0.1 or below
Eng on Domestic + Posts | Engagements on positive domestic posts
Eng on Domestic - Posts | Engagements on negative domestic posts
CCP engagements per post | CCP engagements divided by CCP posts
Ratio positive to negative domestic engagements | Engagements on positive domestic posts divided by engagements on negative domestic posts
CCP Polled Approval | CCP polled approval ratings
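The positive and negative post counts depend on the ±0.1 sentiment cutoffs in the data dictionary. The following sketch illustrates that classification step. It assumes each post already carries a single sentiment score between -1 and +1; how sentence-level scores are aggregated to the post level is not described in the source, so that step is left out. The function name and sample scores are hypothetical.

```python
from collections import Counter


def sentiment_label(score: float, threshold: float = 0.1) -> str:
    """Classify a post-level sentiment score using symmetric cutoffs."""
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"sentiment score out of range: {score}")
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical post-level scores for domestic posts that reference China.
scores = [0.45, 0.10, 0.03, -0.02, -0.10, -0.60]
counts = Counter(sentiment_label(s) for s in scores)
print(counts["positive"], counts["neutral"], counts["negative"])
```

Posts scoring strictly between -0.1 and 0.1 count towards Domestic Posts about China but towards neither the positive nor the negative total.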

Feature Selection

To find the relative importance of the above variables and their effect on CCP digital influence, we use a combination of statistics and data science methods, including variance inflation factors and regression modeling. Here we assume that there exists a relationship, correlational or otherwise, between our features and the relative attitude of citizens in the 17+1 countries towards China. By including only features with a statistical relationship to approval ratings, we’re able to account for features that could be manipulated by inauthentic behavior online.

Variance Inflation Factors

We first use Variance Inflation Factors (VIF) to reduce multicollinearity between our variables. This results in a reduced set of 5 variables.

Feature | Description
CCP Post Share | CCP posts as a share of all posts
CCP Engagement Share | CCP engagements as a share of all engagements
CCP engagements per post | CCP engagements divided by CCP posts
Ratio positive to negative domestic engagements | Engagements on positive domestic posts divided by engagements on negative domestic posts
CCP Polled Approval | CCP polled approval ratings

For a refresher on multicollinearity see this blog post.
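As a rough illustration of the screening step, the sketch below computes a VIF for each feature column with NumPy. It relies on the standard identity that the VIF of feature j equals the j-th diagonal entry of the inverse feature correlation matrix, which is the same as 1 / (1 - R_j^2) from regressing feature j on the others. The data is synthetic. The source does not state the cutoff or the elimination procedure that was used, so none is assumed here.

```python
import numpy as np


def variance_inflation_factors(X: np.ndarray) -> np.ndarray:
    """VIF for each column of X (rows are observations, columns are features)."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


# Synthetic example: column `c` is nearly a copy of column `a`.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.05 * rng.normal(size=200)
X = np.column_stack([a, b, c])

vifs = variance_inflation_factors(X)
for name, vif in zip(["a", "b", "c"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

In this synthetic case the two near-duplicate columns receive very large VIFs while the independent column stays close to 1, which is the pattern that motivates dropping one of a collinear pair.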

Regression Modeling

Using the reduced set of variables above, we investigate the relationship between these digital influence variables and Chinese polled approval through regression modeling. More specifically, we use Ordinary Least Squares (OLS) regression to see if there exist significant relationships between our independent variables and polled approval (our target variable).

Using a significance level of .05, we interpret the significant coefficients as follows:

  • A one-unit increase in CCP engagements per post is associated with a 0.0030% increase in CCP approval based on our data. Put another way, 100 more CCP engagements per post are associated with a 0.3% increase in CCP approval.
  • A one-unit increase in the ratio of positive to negative domestic engagements is associated with a 0.0006% increase in CCP approval, which is equivalent to a 1,000-unit increase in the ratio being associated with a 0.6% increase in CCP approval.
  • A one-unit increase in CCP Engagement Share is associated with a 0.000008% increase in CCP approval.
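To show how coefficients of this kind are obtained and read, here is a minimal OLS sketch using NumPy on synthetic data. It is not the model fitted in the analysis: the sample, the true slope of 0.003, and the variable names are all made up for illustration. It returns coefficients, standard errors, and t-statistics; converting a t-statistic to a p-value for the .05 test would additionally require a t-distribution, which is omitted here.

```python
import numpy as np


def ols_fit(X: np.ndarray, y: np.ndarray):
    """OLS with an intercept. Returns (coefficients, std. errors, t-statistics)."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - A.shape[1]                          # residual degrees of freedom
    sigma2 = resid @ resid / dof                  # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


# Synthetic sample of 17 hypothetical countries.
rng = np.random.default_rng(1)
n = 17
eng_per_post = rng.uniform(0, 200, size=n)
pos_neg_ratio = rng.uniform(0, 5, size=n)
approval = 10 + 0.003 * eng_per_post + rng.normal(0, 0.01, size=n)

X = np.column_stack([eng_per_post, pos_neg_ratio])
beta, se, t_stats = ols_fit(X, approval)

# A coefficient is the change in the target per one-unit change in a feature,
# so the change associated with 100 more engagements per post is 100 * beta.
print(f"slope per engagement: {beta[1]:.4f}")
print(f"change for +100 engagements per post: {100 * beta[1]:.2f}")
```

Scaling a coefficient by the size of the change is all that the "put another way" restatements in the list above are doing.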

From the above, we select the significant features to include in our index. Note on the above modeling: Future iterations of the model could be significantly improved by expanding the analysis across time and location. The inclusion of more countries and the inclusion of changes in online features and approval over time would allow for a more rigorous model.

Creating the Index

Combining feature engineering and feature selection, we then create an index to approximate the digital influence of the Chinese Communist Party by using a linear combination of the significant and non-multicollinear variables. We do this such that:

$$C_{\text{index}} = x_1 w_1 + x_2 w_2 + x_3 w_3 + x_4 w_4$$

where $C_{\text{index}}$ is the estimated digital CCP influence, $x_i$ are the variables as described above, and $w_i$ are weights based on significance and scale.
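The linear combination itself is a weighted sum. The sketch below shows the computation only: the feature values and weights are placeholders we invented, since the actual weights are defined in the notebook and not reproduced in this text.

```python
from typing import Sequence


def weighted_index(features: Sequence[float], weights: Sequence[float]) -> float:
    """Linear combination: the sum of x_i * w_i over all features."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(x * w for x, w in zip(features, weights))


# Placeholder values for x_1..x_4 and w_1..w_4 (not the real weights).
x = [1.0, 2.0, 3.0, 4.0]
w = [0.5, 0.5, 1.0, 2.0]
print(weighted_index(x, w))
```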

Finally, we assume that CCP digital influence worldwide follows a normal distribution and apply a logarithmic transformation so that the index is approximately Gaussian. This allows us to create six buckets based on the digital CCP influence predicted by our data, as follows:

CCP Index range | Digital Influence
< 5.0 | No influence
5.0-5.5 | Minor influence
5.5-6.0 | Some influence
6.0-6.5 | Considerable influence
6.5-7.0 | Major influence
> 7.0 | Digital dominance
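The bucket table translates directly into a lookup. In the sketch below, the treatment of boundary values is our assumption: the table does not say which bucket a score of exactly 5.5 belongs to, so each range is treated as including its lower bound and excluding its upper bound. The function name is hypothetical.

```python
from bisect import bisect_right

# Bucket edges and labels from the table above.
# Assumption: a score equal to an edge falls into the higher bucket.
EDGES = [5.0, 5.5, 6.0, 6.5, 7.0]
LABELS = [
    "No influence",
    "Minor influence",
    "Some influence",
    "Considerable influence",
    "Major influence",
    "Digital dominance",
]


def influence_bucket(score: float) -> str:
    """Map a log-scale index score to its influence bucket."""
    return LABELS[bisect_right(EDGES, score)]


# Scores from the country results.
for country, score in [("Lithuania", 3.93), ("Hungary", 5.78), ("China", 7.84)]:
    print(country, score, influence_bucket(score))
```

Applied to the published scores, this reproduces the labels in the results table, with China falling in the top bucket.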

For the specifics on the code to compute the index see the Developing the Index section of this Colab notebook.