Digital Decay and the Global Politics of Virtual Infrastructure

My work is based on the analysis of digital data. I began my PhD with an interest in the representation of identity and international security within online news, examining articles from the mainstream British press. I would carefully collect the URL for each article and dutifully put all of these in my references. After my PhD, I stayed in the same vein, moving towards the examination of social media data. I came to this through a desire to look at the social media posts of arms manufacturers. I noticed that they had accounts on Twitter and thought it would be interesting to examine what they were saying and how. I began this project soon after submitting my PhD in 2019. The landscape is somewhat different now, and it is in this context that I reflect upon the issue of digital decay and the global politics of virtual archives, as well as key methodological issues of data preservation.

Recently, Twitter was bought by Elon Musk, who has made a wide variety of policy and practical changes to the platform. One step he has taken, for example, is to decide that any academic who uses software to automatically gather material from Twitter should have to pay a significant fee ($42,000 per month) in order to retain this information (Stokel-Walker 2023). The actions of Musk have sparked concern, driving some researchers to establish the Coalition for Independent Technology Research, which has started a campaign to share digital data. It is clear that Twitter is not what it used to be, and what it is now changes quite frequently. Around November 2022 – as has been the case on a regular basis since then – I was gripped by a sudden terror that my data was not properly backed up. In this work, a co-authored piece, we had taken screenshots and placed these in a document for analysis (I had done the same for single-authored work, putting screenshots into a Google Drive folder). In that sense, we still “had” the tweets. But was this really enough? Those who know me know that I take processes quite seriously, and I decided that the answer was “no.”

The reason for my concern is what is referred to as digital decay: we still had copies of the tweets, yes, but the links could potentially go dead. It is perhaps not a significant problem if this happens to one or two tweets, but potentially the whole dataset was at risk. Digital decay is where items that were previously available online are erased either through deliberate removal or failure to maintain a virtual space (Gulotta et al. 2013). Returning to the aforementioned problem of backing up our tweet data, then, I immediately logged all links for my current project with the Internet Archive. The Internet Archive takes a snapshot of a page on a given date after it has been requested by a user, making it accessible forever (in theory). At the time I decided to capture this information in this space, it was quite clear that everyone else was doing the same thing, as the Internet Archive was very slow to save links from Twitter.

I decided to write this article after another incident, where I was writing a blog about one of my recently published articles (Jester 2023a). I wrote the blog and sent it off to be checked first by one person, and then by another, who would upload it. This took maybe six weeks and during this time a variety of the tweets that I had included as evidence were no longer showing using their original links. It is quite possible for digital decay to occur at any time. However, something odd occurred. The second person to check my blog diligently noticed that the tweets were gone, but they were also not gone. When I attempted to access them, they did not go to a dead page but were rather redirected to a 2006 post about Mountain Dew (from an individual, rather than a company). I had noticed that Twitter was starting to behave differently now, but this particular phenomenon was new. I found an alternative way of incorporating these materials into the blog post, but this episode led me to ask wider questions about what we do with our data within our research.

Digital research in international relations has seen a real flowering in recent years, including but very much not limited to Crilley and Pears (2021); Joachim and Schneiker (2021); Stengel and Shim (2021); Hedling, Edenborg, and Strand (2022); and Jackson et al. (2021). I have also noticed an increase in the number of students wanting to undertake projects relating to digital data. This could take a variety of formats, from examining influencers on Instagram to government outputs on Twitter to the global politics of TikTok. So before we go any further, this is my plea to you, whether you are a student, a researcher, or both: if you are undertaking work that incorporates digital data, ensure that you have saved what you need to save, in the right place (there are, of course, ethical issues regarding the use of digital data, but that would require another article, so I will hope that your own processes are ethical). There are a number of different ways in which this might be done.

Firstly, and perhaps most obviously, you should consider logging this material with the Internet Archive. This is a quick process and does not require any sort of registration, but it will mean that you and other people are able to access this material should it be removed. Other types of material are a little harder to capture. Memes are a good example; these are a form of communication that is intended to be reproduced and potentially reframed. Thus, it is often difficult to say who the original author of a meme is and, in any case, it gains currency from the number of times it is re-shared (see Baspehlivan (2023) for an excellent theorisation of the politics of memes). In this case, I would strongly suggest taking a screenshot of the meme and perhaps doing so in the environment in which you have found it. For example, if you see a meme on Twitter that you would like to use in your research, you could take a screenshot of the whole page (which you might then keep as part of a collection, stored in a single document). One type of data has long posed a challenge for me: videos. A variety of video capture software is available, but another lower-tech solution – one that has its own benefits – is the spreadsheet. When I collect my own data, I often use a snapshot approach, collecting all tweets within a given timeframe. I add all of these to a spreadsheet, along with image and video descriptions. Many videos posted on social media are quite short and, as a result, I normally opt to transcribe these and add the transcripts to my spreadsheets as well.
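
For readers who would rather script the spreadsheet approach than maintain it by hand, the following is a minimal sketch in Python and not a description of my own process. The column names, the `PostRecord` class, and the helper functions are all hypothetical, chosen only for illustration. The sketch also assumes that prefixing a page address with `https://web.archive.org/save/` asks the Internet Archive to take a snapshot when that link is visited; check the archive's current guidance before relying on this.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

# Assumption: visiting this prefix followed by a page address requests a snapshot.
SAVE_PREFIX = "https://web.archive.org/save/"


@dataclass
class PostRecord:
    """One row of the research spreadsheet (column names are illustrative)."""
    url: str
    collected_on: str            # date the post was gathered, e.g. "2023-07-01"
    text: str
    media_description: str = ""  # what any image or video shows
    video_transcript: str = ""   # transcription of short videos


def wayback_save_url(url: str) -> str:
    """Build the link that, when visited, asks the archive to snapshot `url`."""
    return SAVE_PREFIX + url


def write_log(records, handle) -> None:
    """Write records as CSV to an open text handle, adding a save-link column."""
    columns = [f.name for f in fields(PostRecord)] + ["save_link"]
    writer = csv.DictWriter(handle, fieldnames=columns)
    writer.writeheader()
    for record in records:
        row = asdict(record)
        row["save_link"] = wayback_save_url(record.url)
        writer.writerow(row)


if __name__ == "__main__":
    # Placeholder data only; example.com stands in for a real post address.
    sample = [
        PostRecord(
            url="https://example.com/post/1",
            collected_on="2023-07-01",
            text="Example post text",
            media_description="Photograph of an exhibition stand",
        )
    ]
    buffer = io.StringIO()
    write_log(sample, buffer)
    print(buffer.getvalue())
```

Writing to a file rather than to the screen only requires passing `open("post_log.csv", "w", newline="", encoding="utf-8")` as the handle. Keeping the descriptions and transcripts in the same row as the link means that the record remains usable even if the link itself later goes dead.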

More broadly, I think researchers need to see this as a moment of reflection on what digital data is and how we understand it as knowledge. My work is partly located within secrecy and transparency studies. As Langlois and Elmer (2013:2) argue, with respect to social media in general, “The first challenge is ontological, in that it requires that we switch our attention away, for a minute, from what is being said (posted, commented, and so forth), to how it is being processed and rendered.” There is probably a lot more to be said about how specifics of digital decay shape our research and how we might understand this ontologically and epistemologically: what can we know about what is no longer there, and how can we know it? I have considered this very briefly elsewhere (Jester and Dolan Forthcoming), but I would like to see more written within this space. Further, as Littman (2019) notes, those using software to automatically “scrape” social media data must agree not to use data that has been removed (not a problem for those of us who do not sign such agreements, though there may be ethical issues to consider). This means that brands are able to use these rules to reduce scrutiny, as a form of brand management. In an international relations context, this logic could also be applied to international institutions, states, or corporations that often have social media accounts – indeed 97% of UN member states have a Twitter account according to Duncombe (2019).

This should also prompt reflections on how society views the archive as a concept in 2023. There is an interesting body of work on the use of archives within international relations, much of it drawing upon historical approaches (e.g. Mulich 2021; Connelly et al. 2021; Morefield 2021). I am not a historian and visit archives occasionally in my spare time for interest, though I do not use them in my research. But recently I have been thinking about archives more; this speaks to the questions of what is considered to be worth preserving (typically a political choice), and whose job that should be. I mentioned the Internet Archive earlier: this does not receive government funding and is currently at risk as a result of a lawsuit from major publishers (Burga 2023). Digital archives more broadly are also subject to global political agreements and regulations. In 2019, for example, the British Library warned that exiting the European Union (and related legal agreements) would cause copyright problems for digital archives such as that of Spare Rib, a groundbreaking British feminist magazine (O’Carroll 2019). Thankfully, this never came to pass, but it does highlight the global nature of the politics of virtual infrastructure.

One of the significant benefits of virtual archives is that it is much easier for ordinary people to get involved: if you have access to a laptop or smartphone, all you need to do is copy the URL, go to the Internet Archive, and log it. You can also view items there. Both logging and viewing can be performed by anyone in the world, regardless of their location, provided they can access the relevant internet domain. In that sense, the Internet Archive is quite democratising. At a time when arts and history organisations across the world are facing funding cuts, this can provide valuable labour (though this can also mean that “archiving” is not conducted to the same standards as it would be by professionals) (Eveleigh 2016). Is the virtual archive the kind of work that should be undertaken by research centres, by an individual government, or by an international organisation, such as the European Union? The Internet crosses every global border, which creates further questions about whose responsibility governance is, who the ultimate arbiter is, and who should pay for regulations, infrastructure, and enforcement. There is no straightforward answer to any of these issues. This conversation is especially vexed because Internet usage also varies quite significantly across the world, with some places using it less than others, and with some states shutting the Internet down either temporarily or permanently to prevent their citizens from accessing it. As Drezner (2004) has noted, this may be an issue of global governance, but states still have an important role to play.
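
Viewing can be automated in the same spirit as logging. The sketch below is illustrative only: it assumes that the Internet Archive offers an availability lookup at `https://archive.org/wayback/available` which returns JSON containing an `archived_snapshots` entry with a `closest` snapshot, and the function names are hypothetical. If the service has changed, the parsing step would need to be adjusted accordingly.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the archive exposes an availability lookup at this address.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url: str) -> str:
    """Build the lookup address for a given page, percent-encoding the page URL."""
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode({"url": url})


def closest_snapshot(payload: dict):
    """Return the address of the closest archived copy, or None if there is none."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(url: str, timeout: float = 30.0):
    """Ask the archive whether a copy of `url` exists (requires network access)."""
    with urllib.request.urlopen(availability_query(url), timeout=timeout) as response:
        payload = json.load(response)
    return closest_snapshot(payload)
```

Calling `lookup("https://example.com/post/1")` would return the address of an archived copy if one has been logged and `None` otherwise, which offers a quick way of checking whether every link in a dataset has a backup before the original disappears.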

Adding to this complexity is the issue of finance and capital. Digital media corporations sell advertising and also pay social media management companies to act on their behalf (Jester 2023b). Put differently: there is a significant amount of money at stake with regard to all of the above questions. What happens, however, when a company owner acts in a way that loses money and puts the company at risk? It is certainly the case that Twitter has lost a significant amount of money in the last year, prompting users, researchers, and tech writers to ask: “Is Twitter finally dying?” (Ghaffary 2023). Twitter is not a perfect platform, I recognise that, but it has done a significant amount of good, from connecting marginalised people to sending out information about disasters. Taking this question from a very different perspective, we might consider the role of antitrust laws, which seek to prevent individual companies from achieving dominance by restricting competition. With regard to digital spaces, there has recently been concern about the data-gathering practices of the largest companies. Meta – which owns Facebook, Instagram, WhatsApp, and now Threads – lost a case in July at the Court of Justice of the European Union (Chee 2023). An article by Tucker and Marthews (2011) – written just over ten years ago and arguing that antitrust was not likely to be an issue for social media companies – shows just how quickly matters have changed within the digital sphere. We might also need to consider, then, who gets to decide how these companies are run. Who should get to decide? Should CEOs be able to shut down social media companies or not?

My thoughts on these matters began with Twitter. But in considering how to address this as a research problem, we must think more deeply about the politics of virtual infrastructure. If Twitter goes down and the Internet Archive closes, a huge amount of information will be lost. The inability to access information such as this may, perhaps, reflect an anti-democratisation of knowledge where previously the Internet had opened it up to the masses. I do not have all of the answers to these problems or issues, but my aim here has been to provoke us all to think more about them. In conclusion, this article has argued that, as researchers, we should be paying more attention to digital data. The recent acquisition of Twitter by Elon Musk has also prompted me to be more reflexive about issues such as digital decay, where information can simply disappear from the Internet. More broadly, this is congruent with wider questions about the politics of Internet infrastructure around regulation, copyright, and who should be responsible for Internet governance. Finally, and perhaps most concretely, I hope that this article might act as a call to those of you who have not yet considered this in detail, to think further about the use of digital data within your own work and how this might be backed up.


