Update 2020-06-07: I have done more in-depth investigations into TikTok, whose results can be found at tiktokodds.com.
The communities I am part of rely heavily on longform text for communication. The world seems to be moving away from text (and especially longform text).
One clear example is TikTok, currently the most downloaded social media application, which centers on sharing short videos. I’m interested in whether important concepts can be communicated via this medium.
As part of that, I researched the recommendation algorithm, which has led me to some success (e.g. a coronavirus video with 400,000+ views). Because I found it very hard to get useful information about TikTok while doing this research, and because I want the sort of person who would read this post to get wider visibility for their ideas, I am writing up this summary.
Most TikTok videos are viewed through the “for you page”, roughly analogous to the Twitter feed. The TikTok algorithm recommends videos for you to view on this page. Note that, unlike with Twitter or Instagram, a large fraction of the content comes from creators that the user does not follow.
The TikTok recommendation algorithm is proprietary and mostly confidential; the few things we do know come from information that employees have leaked to the press.
The TikTok recommendation algorithm consists of two components: an automated component and a manual component.
When a user creates a video, TikTok analyzes the video to identify its target audience. They claim to use a combination of discrete fields (namely the hashtags used by the author and the sound chosen), natural language processing (presumably analyzing any text overlaid on the video), and computer vision. For example, they might analyze your video and find that it contains images of a cat, overlaid text like “meow”, and the hashtag “#catsoftiktok”. They will use this information to identify an audience of people who like cat videos.
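To make the classification step concrete, here is a toy sketch of what it might look like. Everything here is my guess at the shape of the system: the function name, the signal names, and the keyword table are all hypothetical, and a real implementation would presumably use learned models rather than hand-written rules.

```python
def infer_audience(hashtags, sound, overlay_text, detected_objects):
    """Toy sketch of audience identification: combine discrete fields,
    overlaid text (NLP), and computer-vision labels into an audience guess.
    All names and the keyword table below are hypothetical."""
    signals = {h.lstrip("#").lower() for h in hashtags}
    signals.add(sound.lower())
    signals |= set(overlay_text.lower().split())
    signals |= set(detected_objects)
    # A real system would presumably map signals into a learned embedding
    # space; here we just count overlaps with a hand-written keyword table.
    audiences = {
        "cat video fans": {"catsoftiktok", "meow", "cat"},
        "dance fans": {"dance", "choreography", "person_dancing"},
    }
    return max(audiences, key=lambda name: len(audiences[name] & signals))

# The cat example from above: hashtag + overlaid text + vision label.
print(infer_audience(["#catsoftiktok"], "original sound", "meow", {"cat"}))
```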
They then create a small sample audience of cat-video-lovers who get shown this video. If this audience likes the video, it will be shown to a larger audience. If that audience likes it, it will be shown to a still larger audience, etc.
Whether the audience likes the video is some function of how they engage with it: do they watch it the whole way through, like it, comment, share, etc. A common heuristic is that a video needs at least 10% of the viewers to like it in order to advance to the next stage.
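To see how these two mechanisms fit together, here is a small simulation of the staged rollout. The stage sizes are made-up numbers (I have not seen leaked figures for them); only the 10% like heuristic comes from the reporting above.

```python
import random

def simulate_rollout(like_prob, stage_sizes=(100, 1_000, 10_000, 100_000),
                     threshold=0.10):
    """Advance a video through successively larger audiences, stopping as
    soon as fewer than `threshold` of the current audience likes it.
    Stage sizes are illustrative guesses, not leaked figures."""
    total_views = 0
    for size in stage_sizes:
        total_views += size
        likes = sum(random.random() < like_prob for _ in range(size))
        if likes / size < threshold:
            break  # failed the 10% heuristic; the rollout stops here
    return total_views

# A video that ~12% of viewers like will usually clear every stage,
# while one at ~8% will usually stall after a round or two.
print(simulate_rollout(like_prob=0.12), simulate_rollout(like_prob=0.08))
```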
Reportedly, moderators manually review videos three times: once after the video has ~100 views, again after ~8,000 views, and a third time after ~20,000 views. At each stage, the moderator can set different levels of visibility, which determine how widely the video will be shown. These visibility levels are not disclosed to the author.
Early leaks of the moderation guidelines showed that they included limiting the visibility of videos featuring disabled (and possibly also LGBT+) creators, as well as politically sensitive content like protests.
The Intercept recently obtained a more complete leak of TikTok’s moderation guidelines. The guidelines mostly attempt to limit sexually “explicit” content (“explicit” in scare quotes because the guidelines include things like banning people for wearing bikinis when not swimming) and politically “controversial” content (notably content that makes China look bad).
Interestingly, moderators are also instructed to prohibit “ugly” people and environments (slums etc.). The motivation here seems to be that new users will bounce if they are presented with videos of ugly people/places.
On the other side of things, TikTok apparently hired contractors to steal “nice-looking” videos from Instagram and repost them on TikTok. The #BeachGirl hashtag was specifically mentioned as a source contractors should use.
My guess is that there are two separate moderation goals at play: one is politically motivated (e.g. limiting videos about Tiananmen Square), and the second is targeted towards increasing engagement (e.g. displaying videos from more attractive users). TikTok’s official party line seems to be that these guidelines are “locally” produced (implying that the political censorship only happens in China).
It seems likely that the number of views any video can expect to receive should be modeled piecewise, with four separate models: one applicable while the video has fewer than 100 views, one for the 100–8,000 regime, one for 8,000–20,000, and one for 20,000+ (corresponding to the thresholds at which manual review takes place).
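Written out, and anticipating the power-law form derived below, such a piecewise model might look like this (the functional forms are my assumption; only the thresholds come from the reporting above):

$$\Pr[\text{views} \ge v] \propto \begin{cases} v^{-\alpha_1} & v < 100 \\ v^{-\alpha_2} & 100 \le v < 8{,}000 \\ v^{-\alpha_3} & 8{,}000 \le v < 20{,}000 \\ v^{-\alpha_4} & v \ge 20{,}000 \end{cases}$$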
For simplicity, I will just assume there is a single model. The process of iteratively showing videos to larger audiences implies a distribution like: 1/2 chance of being shown to 10 people, 1/4 chance of 100, 1/8 chance of 1,000, etc. More generally, this implies some distribution like

$$\Pr[\text{views} \ge m^k] = p^k$$

for some parameters $m$ and $p$ (in the example above, $m = 10$ and $p = 1/2$).

Noting that $k = \log_m v$, we can re-parameterize this as

$$\Pr[\text{views} \ge v] = p^{\log_m v} = v^{-\alpha}, \qquad \text{where } \alpha = \log_m(1/p)$$
We can see that this implies a power law distribution. Indeed, Chen et al. 2019 found that views on the most popular videos were Zipf-distributed.
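As a sanity check on the algebra, here is a short simulation of the iterative-promotion model, comparing the empirical survival function against the predicted $v^{-\alpha}$. The parameters are the illustrative ones from above ($p = 1/2$, $m = 10$), not estimates of TikTok’s real values.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_views(p=0.5, m=10, n=200_000):
    """Sample view counts from the toy model: a video survives each
    promotion stage with probability p, multiplying its audience by m."""
    stages = rng.geometric(1 - p, size=n) - 1  # stages survived: 0, 1, 2, ...
    return np.power(float(m), stages)

views = sample_views()
alpha = np.log(2) / np.log(10)  # predicted exponent for p=1/2, m=10 (~0.30)
for v in (10, 100, 1_000, 10_000):
    print(f"P(views >= {v:>6}) = {(views >= v).mean():.4f}"
          f"  vs predicted v^-alpha = {v ** -alpha:.4f}")
```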
Looking at the view counts on my own videos, a simple power law distribution fits reasonably well (using the model $\Pr[\text{views} \ge v] = v^{-\alpha}$ from above).
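For anyone who wants to repeat this on their own account, here is the fitting procedure I would use: a least-squares fit to the log of the empirical survival function. The view counts below are made up purely to illustrate the mechanics.

```python
import numpy as np

# Hypothetical view counts for a handful of videos (made-up numbers).
views = np.sort(np.array([231, 415, 552, 789, 1_340, 2_210, 8_904, 413_000]))

# Empirical survival function: fraction of videos with at least v views.
survival = 1.0 - np.arange(len(views)) / len(views)

# Fit log P(views >= v) = -alpha * log(v) + c by least squares.
slope, intercept = np.polyfit(np.log(views), np.log(survival), 1)
print(f"fitted exponent alpha = {-slope:.2f}")
```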
My intuition is that the reach of most content is not substantially changed by manual moderation. (As one intuition pump: TikTok wants to promote videos which are popular among their target audience, and the empirical evidence about how the sample audience actually engaged with a video is probably a better signal of this than the opinions of some random moderator.) This implies that it’s more important to focus on optimizing for the automated part of the algorithm.
This section is a brainstorm of ways one can optimize their videos for wider distribution. These all seem like reasonable hypotheses to me given the above information, but I have no real evidence to support them.
Also, it’s worth pointing out that TikTok has fascist moderation policies, and optimizing for fascist moderation is maybe a bad idea.
Douyin is a sister product to TikTok. Several trends on Douyin do not seem popular on TikTok yet. For example:
Interestingly, several young users reported that they like watching and sharing videos in which the creator is sharing good-looking profile images for them to adopt for their social accounts, fancy nicknames for online social accounts, and creative animated mobile phone wallpapers for individuals or couples (Figure 1). This shows a trend that the content of some short videos can be easily “materialized” and adopted in other social channels, and the videos penetrate into users’ real life. As noted by P28, a male student who was in a relationship:
“My girlfriend really likes to adopt the profile images recommended by a content provider as her WeChat account. She says that those profile images are beautiful and can fit with her personality well. She also likes to adopt animated mobile phone wallpapers shared on Douyin. Sometimes those wallpapers are for a couple, which are very creative in that two phones can create a holistic story when put together, and she always urges me to adopt those wallpapers together with her. It is a way to show to her friends that we are in a sweet relationship. ”
Lu and Lu 2019
Academic literature presumably lags trends by a significant amount, so if you have direct access to Chinese or other markets where you can copy trends more rapidly, that’s probably better.
TikTok’s recommendation process makes coordinated voting hard. Because the decision about promoting a video is based on how a randomly selected audience interacts with it, having your friends all like your video doesn’t do much.
One possibility is that you could ensure that your friends are disproportionately represented in the randomly selected audience. For example, you could have some hashtag that only you and your friends use, and TikTok might therefore automatically choose your friends to be in the sample audience.
This is just a hypothesis, and I have low confidence it will work.
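To get a feel for how much this could matter if it worked, here is a back-of-the-envelope calculation. It assumes a sample audience of 100, the 10% like heuristic from above, and a baseline 5% like rate among strangers; all three numbers are illustrative.

```python
from math import comb

def pass_probability(sample_size=100, threshold=0.10,
                     base_like_rate=0.05, seeded_friends=0):
    """Chance of clearing the like threshold in the sample audience, if
    `seeded_friends` members are guaranteed likes and the remaining
    strangers like independently at `base_like_rate`."""
    needed = max(0, round(threshold * sample_size) - seeded_friends)
    strangers = sample_size - seeded_friends
    # P(Binomial(strangers, base_like_rate) >= needed)
    return sum(comb(strangers, k)
               * base_like_rate ** k
               * (1 - base_like_rate) ** (strangers - k)
               for k in range(needed, strangers + 1))

print(f"no friends seeded:   {pass_probability():.2f}")
print(f"five friends seeded: {pass_probability(seeded_friends=5):.2f}")
```

Under these made-up numbers, seeding even a handful of guaranteed likes substantially raises the chance of clearing the first threshold, which is why the hypothesis seems worth testing despite my low confidence in it.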
I would like to thank JP Addison for discussions that led to this work, particularly the observation that EA is heavily tied to longform text.