Many fields of knowledge can hit a human limit. There is only so much one person can know, research, or understand. Biases and narrow focuses might prevent even the knowledgeable from drawing the best conclusions. And there are some subjects that might exceed human cognitive capacities altogether without the right tools - a hurricane was unknowable until data, statistical methods, and simulations unimaginable a century ago came along.
There is much left to learn about history even with the information we have, but I believe that history and historical research can be vastly improved with the careful incorporation of certain AI technologies. This will not only allow us to gain greater knowledge of the past, but can be used to enhance our understanding of the patterns and dynamics that shape human society.
Why History
AI is the next frontier of information technology and has the potential to fundamentally transform nearly all knowledge work. Different fields will change different amounts depending on the nature of the tasks involved, the technology that becomes available, and the willingness of the practitioners to adopt and adapt to emerging possibilities.
While there is exciting work ongoing in various scientific and technological fields, there is less happening in Humanities research. I believe that the Humanities have strong potential to be made far more potent by advances in AI. Furthermore, AI-enhanced Humanities may serve crucial societal functions as we enter a rapidly changing future.
I’m going to focus on broad big-picture history and imagine what AI can do with this field in the near future and the technologies we need to enable that. It is my hope that outlining the potential and construction of such a system will help make the best version of it more realistically achievable.
The Challenges of History
The philosopher and historian Charles Taylor published A Secular Age, his examination of the development of Secularism in the West, in 2007 when he was 75 years old. Taylor offers a comprehensive account of how the religious landscape of the modern world completely changed in the past few centuries, a transformation which has deeply affected our daily lives, understanding of morality, and relations with each other. The work is based on a lifetime of scholarship embedded in the early modern intellectual history of Europe and is clearly incomplete. Taylor is only able to offer a partial account of the full history that began the modern world and defines our contemporary moral systems because a full account would require more years than he has (and he had other books to write!). A Secular Age would be enhanced with a full examination of Secularism in Scandinavia or the Iberian peninsula, or even more detail on the regions and times it already covers. The work almost demands follow-up works looking at religious modernization in the Islamic, Hindu, and Buddhist worlds; as of 2023, such works either do not exist or exist only in highly imperfect form.
There is a lot of history. And history is only getting bigger. Not only does time march on with more and more people, but in the 21st century we have gathered an incredible number of primary sources for nearly every historical time period, not to mention the growing centuries of secondary interpretation. Given human longevity and capabilities, it is impossible to read all of it. In fact, it may even be impossible to read all the sources on just one subject - the French Revolution, the Taiping Rebellion, and the Vietnam War are all large and consequential events which exceed the understanding of individual people or teams of researchers. We necessarily rely on summaries on top of summaries on top of summaries to make any sense of such world events and synthesize them into a single coherent account. Each level of summary involves a loss of information and a further removal from the primary sources. These summaries often turn out to be untrustworthy, and every generation of history involves a new examination of the sources, starting again from the beginning.
This is one explanation for why history is so difficult, and why so many historians end up with opposite interpretations of the same events. Through hard work, we know much about such major events but we often lack a working consensus on what they mean or what they can teach us.
Academic history has partially responded to this by attempting to focus narrowly on only what can be known with legible certainty. Typically historians start their careers writing highly specific books and gradually move up to writing more broadly, and for more popular audiences. However, they usually hit a limit: if a historian goes too broad, they will miss the details and fail to deliver. And they hit the limit disappointingly quickly. There are many areas of history that we know very little about for lack of serious research into them, and there are topics worthy of extensive and broad treatment that have not yet received it.
Part of why we don’t yet have this history is academic incentives and lack of public interest, but part of it too is that works such as Taylor’s run up against the limit of a human lifetime of work. One would be tempted to solve this through the use of teams of researchers who can specialize and cover much more knowledge than just one person. However, this does not seem to really work in practice. As with many humanistic disciplines, the final synthesis of all the facts into transmittable understanding seems to rely on a single consciousness having all of it at once, and the intellectual connections between teams or collaborators are not strong enough to do this reliably. The academic collaboration and availability of books in the present age are already huge augmentations to this process and have enabled incredible gains, but they are not enough.
To make history better, we need to expand consciousness. Expand the amount an intelligence can learn and understand in a single working lifetime, and ideally empower them to go much beyond our present capabilities. If this is to happen in the near future, the best bet is some enhancement involving AI, whether through AI tools which significantly augment a human consciousness or using our full human toolkit to make AI systems capable of advanced historical understanding.
Decomposing History
My ideal method for solving this problem would be to enhance Charles Taylor with infinite longevity, superhuman reading speed, and unnatural recall ability but this seems out of reach for now. But it might be within reach to reimagine historical research with the emerging tools of Artificial Intelligence.
We are very far away from Large Language Models (LLMs) such as GPT-4 being able to do all the research, synthesis, and explanation from a simple prompt. Furthermore, the current problems with LLMs, such as hallucination, are far from ideal for good history, which requires very careful rooting through facts and explanatory ability. As AI progress is unpredictable, it is possible that all these problems may simply be solved in a single leap in a few years' time, but my current bet is that this will not happen and that it is worth working on the problem through other methods.
A more promising immediate approach is to tackle this through task decomposition. By breaking the historical method into the various processes of research, synthesis, organization, and explanation it becomes much more doable for different semi-specialized AI systems to assist with these processes. Furthermore if one key component is missing, say that AI can do research and writing but has trouble determining which parts of research to use, a human can step in to complete the process. None of these processes are fully independent, and most versions of this system will require human supervision and verification at each stage, but the gains are still potentially quite large even with limitations.
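As a rough illustration of this kind of decomposition, here is a minimal sketch in Python. Everything in it is hypothetical: the `Stage`, `Record`, and `run_pipeline` names are invented for this example, and each "AI module" is just a function that either returns text or returns `None` when it cannot do the job. The point is only to show the control flow: each stage tries its AI module, a human steps in when a module is missing or its output is rejected, and every stage is logged with who did the work.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stage:
    name: str
    # Hypothetical AI module; None means no module exists for this stage yet.
    ai_step: Optional[Callable[[str], Optional[str]]] = None


@dataclass
class Record:
    stage: str
    output: str
    done_by: str  # "ai" or "human"
    verified: bool


def run_pipeline(
    question: str,
    stages: List[Stage],
    human_step: Callable[[str, str], str],
    human_verify: Callable[[str, str], bool],
) -> List[Record]:
    """Run each stage in order, falling back to the human where needed."""
    log: List[Record] = []
    current = question
    for stage in stages:
        # Try the AI module first, if one exists for this stage.
        result = stage.ai_step(current) if stage.ai_step else None
        done_by = "ai"
        # The human takes over if the module is missing, gives up,
        # or produces output that fails human verification.
        if result is None or not human_verify(stage.name, result):
            result = human_step(stage.name, current)
            done_by = "human"
        log.append(Record(stage.name, result, done_by, verified=True))
        current = result  # each stage feeds the next
    return log
```

A historian might configure this with stages such as research, selection, synthesis, and explanation, leaving `ai_step` empty for selection if that is the piece current systems handle badly. The log makes it easy to see afterwards which parts of a result rest on machine output and which on human judgment.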
While it would be amazing to make an AI that can do this whole process in extremely fast computer time, there are truly significant gains to be reaped from even more modest productivity gains. The historian Robert Caro has devoted five decades of his career to writing about Lyndon Johnson; if he could write three biographies of equivalent quality, that would be an immense benefit to humanity. Other modest improvements might include increasing the ability of historians to cover more subject areas with confidence they are not contradicting some known fact. Tools like an automatic fact checker would likely significantly improve existing works of history which now suffer from minor errors.
As an example, a historian might be writing a history of European monarchs’ relationships to their subjects. It might be quite important for them to understand how the Russian people went from obedient Tsarists in the mid-19th century to rather indifferent citizens who scarcely made complaint when Nicholas II abdicated. Normally this would take months of reading - general histories, economic examinations of rural and urban change, press accounts of the sovereigns and their families, social histories of different minority populations, first hand diary accounts. It would require deep familiarity with the Russian language, but also the languages of the many subject populations of the Tsar, as well as of neighboring nations. Ideally a highly capable history AI with many decomposed abilities would be able to summarize all these sources, understand each of their biases and limitations, synthesize different micro and macro level understandings and help the historian to integrate and compare this example to that of other nations examined in their more general history. If the historian makes a claim, the AI will be able to affirm or contradict them, and carefully cite all relevant sources if interrogated. The historian will still have to work hard to gain a full understanding of this period in history, but the months will be reduced to weeks or days, and they will be able to have full confidence that their knowledge is checked and verified at a similar level to having a colleague with a different expertise look over their manuscript.
Here is a first attempt at decomposing the process of writing history into a series of subtasks along with some speculations on what sorts of AI systems could be used to achieve them, as well as obstacles to that. The hope is that each module will output something useful both to an AI or a human in the chain. I am not a professional historian and have not written a history book, but I’m hoping that my experience researching and writing historical essays will serve well enough to take a first stab at this. I am intending this as a living document and welcome comments and critiques.
Stages of Development
AI for enhancing historical research must begin by enhancing and augmenting existing historical research rather than trying to replace it. Furthermore, the historians who use such a system in its early stages must be deeply familiar with the system and any errors or imperfections, whether small or systematic, it might introduce into the historical research process. This project therefore fits into the Cyborgism research paradigm.
This AI system likely begins as small enhancements to existing non-conceptual tasks. This includes fact-finding as well as tracing sources and references, often extremely laborious tasks which can be performed quite well by artificial systems. In this case, all actual processing of facts and sources is performed by the human historian. It is likely that being embedded in the artificial system drastically increases the amount of information they are able to gather, but the AI is not involved with the knowledge itself.
In the next stage, the AI becomes more like a research assistant, or even better a team of research assistants. The AI is able to proactively seek out sources, perform analyses, and write summaries of sources or domains. At present, the research assistant relationship can be imperfect, as assistants are often still learning and need their work checked and confirmed - it cannot be trusted unless the primary researcher verifies it. This same relationship will likely play out similarly with the AI, and the AI may start out making far more errors and be slower to learn. However, unlike a human assistant, the AI will never become its own separate researcher uninterested in continuing to help. The historian will have the opportunity to improve the system, get to deeply know its capabilities and limitations, and integrate it holistically into their long-term knowledge acquisition and processing practice. This may be further accelerated if multiple historical researchers use the same systems and contribute collectively to its improvement. This stage naturally progresses from the AI working as a research assistant to the AI working as a collaborator. Academic collaborations between two or more researchers with different specialities can often be extremely fruitful, though they naturally run up against problems of conflicting personalities or priorities. Hopefully an AI system can steer away from these pitfalls and make something still more promising. In terms of increasing confidence and accuracy of information, one could have a potential manuscript reviewed by a small army of grad-student equivalents with expertise in different specialties. These would still miss subtle mistakes, but if an expert on the Soviet economy makes a minor error about internal Chinese politics in a book on trade, AI modules specifically specialized in China might catch it.
Once the AI is a collaborator, the researcher has been dramatically enhanced in the amount of information they are able to gather and understand. There is only so much information they can keep in their memory or their notes, but the AI collaborator has exceeded both. Furthermore, the researcher knows and understands the AI system well enough that any errors or lapses in the AI’s abilities can be compensated for by human understanding. At this point the historian is able to produce more research at a much higher quality. If they want to enhance their understanding of European monarchs’ relationship to their subjects, they can quickly get information - a summary of the development of small German states, notable reactions to the 1848 revolution, a speculative analysis informed by agricultural sciences on how natural conditions may have affected local circumstances, a cross-check on all primary sources to see if these natural conditions show up in first-person accounts, a probability estimate of how widely read such accounts were by the nobility themselves, a comparison of the same analysis with relationships between local nobles and subjects in the Hapsburg empire. This is not performed instantly, and throughout the researcher is carefully considering the next steps and perhaps even “discussing” what makes sense with the AI system. But research that would take months now takes hours. In these circumstances, the augmented historian knows more and better than their unenhanced self.
The final stage, which may be impossible, is to take this greater understanding several steps further. What would it be like to understand World War II by understanding not just the battles, the economics, and the ideologies, but to understand the individual biographies and motivations and feelings of every soldier and civilian involved? And if this is possible for any one intelligence to understand, is it possible that this deep understanding can lead to a greater and higher synthesis of history? This is unknown: it may well be beyond the capabilities of the human mind, and there is not yet strong thinking on how an artificial intelligence could do these things either. But if such an understanding is possible, it would be best to reach it through the previous stages to ensure that it is indeed the same kind of understanding we care about.
A Better Humanities
If this AI system comes into existence, it may end up simply augmenting and enhancing history as it currently exists. But it may well be possible to do more.
The Humanities have existed as long as literate culture but have encountered persistent problems especially compared to their cousin sciences. Literary and art criticism, philosophy, history, and linguistics all have struggled with defining themselves and their relationship to truth. While all these disciplines have progressed, they have progressed without certain conclusions or easily distilled knowledge. Furthermore, their recent concentration in the contemporary university system replacing independent scholars has led to a tragic disengagement with an increasingly literate public.
Partisans of the exact sciences often tend to dismiss these issues as being inherent to the disciplines themselves. There is ultimately no truth to philosophy or to the meaning of a work of art, and the professors in these fields are deliberate obscurantists holding onto tenure through societal inertia. But what if the problem is not the disciplines, but our methods? Modern science was just another branch of speculative philosophy before its practitioners discovered the power of repeated experiments. Much of the capability of recent science was unlocked thanks to the invention of statistical methods allowing the requisite understanding in one mind of much larger amounts of data than were previously possible. One cannot truly understand a revolution before it has happened and this is all speculative, but I think there is a chance that greater understanding of the multiplicities of human life in a single consciousness might enable a new kind of Humanities. And this new understanding might be communicable and widely distributable to the general population.
If fully successful, this historical AI will be able to understand both the broad patterns of history and the individual emotional moments of every person involved. Events will be understandable simultaneously through their economic impact, their political ramifications, and the hopes and dreams of every participant. Such an understanding is currently attributed only to God, but in the spirit of seeking the truth we ought to be pursuing it.
Applications
A more confident Humanities might also enable far greater uses than are currently imaginable. A truly functional History AI which understands the past perfectly would likely be capable of predicting the future with a high degree of accuracy. It would be able to advise on politics and policy based on a full knowledge of past results and the specific circumstances which led to good and bad outcomes. Historians have often played at this role in the past; we can only guess at how useful a significant enhancement would be.
Later stages of this project have the potential to look like Psychohistory, Isaac Asimov’s fictional evolution of history as a predictive science. Economics has made major strides in understanding large-scale societal phenomena through mathematical methods, but still has obvious limitations and an extremely spotty track record for long-term predictive accuracy. Finance has used some of the same techniques in narrow domains and short timescales, but still fails to account for more complex phenomena. Part of what is missing from these fields is the much wider field of history. We have no good science for predicting when countries will go to war and who will win, even though these have major tangible consequences on many aspects of life even in countries far from the conflict. Recent advances in forecasting suggest that prediction is a learnable skill, but even the most talented have difficulty in the long term. An AI system that combined all of these fields with the benefit of long-run historical understanding might prove to be invaluable for navigating a complex future.
First Steps
Automatic Critique
A frequent problem encountered by historians is needing to read books and consult archives not to learn more, but to ensure that nothing they’ve argued has been contradicted. This process of anticipating reviewers is potentially extremely time consuming, especially since more ambitious books touch on more sub-fields than it is possible to read through. However, it is still good and important, and historical research is surely made better for it.
Rather than having to digest and review an entire archive, current technology could be used to significantly speed up this process. A combination of LLMs and Information Retrieval technology could be employed to check every statement and argument against an archive of user-selected sources. In order to avoid missing potentially relevant passages, the system would have to be calibrated to be overly sensitive, returning many false positives for relevant critique. However, even if the system returns 20 or 50 false positives for every true positive, that still saves the researcher an immense amount of time over reading the entire literature, and increases accuracy well beyond simple skimming.
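To make the "overly sensitive" calibration concrete, here is a minimal sketch in Python. It is a deliberately crude stand-in, not a proposal for the real system: it scores passages by bag-of-words cosine similarity rather than with an LLM or a proper retrieval engine, and the function names, the stopword list, and the default threshold of 0.1 are all assumptions made for illustration. What it shows is the trade-off: a low threshold returns more passages for the historian to review, most of which will be irrelevant, in exchange for missing fewer relevant ones.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "was", "is"}


def tokenize(text: str) -> Counter:
    """Lowercase, split on non-alphanumerics, drop stopwords, count words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def candidate_passages(claim: str, passages: list, threshold: float = 0.1):
    """Return (score, passage) pairs at or above the threshold, best first.

    The default threshold is intentionally low so that weakly related
    passages are still surfaced for human review.
    """
    claim_vec = tokenize(claim)
    scored = [(cosine(claim_vec, tokenize(p)), p) for p in passages]
    hits = [(s, p) for s, p in scored if s >= threshold]
    return sorted(hits, key=lambda sp: sp[0], reverse=True)
```

Calling `candidate_passages` once per sentence of a manuscript, against passages drawn from the user-selected archive, would produce a review queue. Raising `threshold` shortens that queue at the cost of possibly dropping the one passage that actually contradicts the claim.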
Automatic critique and commenting has already been explored for fields such as AI Alignment and companies such as Ought are attempting to incorporate these approaches into the scientific research process. Applying it to history would not be overly difficult and would likely help many to save time without reducing the rigor of their scholarship.
Language Understanding
One area I hope to work on as technology allows is the understanding of sources in other languages. Understanding a primary source in its original language is considered absolutely essential for a serious historian. Translations, even very accurate ones done in perfectly good faith, often hide details or assumptions and history has often progressed thanks to small reconsiderations of the meaning of a word or phrase. Good historians know their source language deeply, usually at a level of fluency and philological expertise.
While there is a great amount of research on machine translation, there is not as much about enabling faster understanding of sources at the level of an expert. To understand a sentence of Plutarch, you either read a translation in a few seconds, or you take years of Ancient and Classical Greek. I’d like to help create an option in the middle. It is no barrier for a historian to learn a few languages to fluency and ideally they probably should, but in my ideal world a historian would not be limited by the number of languages they can learn in a lifetime. It would not be strange at all for a historian of Medieval China to have reason to consult sources in Chinese, Japanese, Sanskrit, Pali, Mongolian, Tibetan, and several Turkic languages, but to learn all of them would require incredible time and learning ability. My goal is to enable a historian to examine a sentence in a language they don’t know and within a few hours become sufficiently convinced of its meaning to make an argument around word-usage or tone without consulting a colleague fluent in the source language.
I have hinted at the beginning of such a system utilizing LLMs in my earlier post on poetry and will attempt to utilize preexisting tools and new avenues of explanation to better enable language study. The best precedent for this kind of work is Bible study tools (both software and more analog) intended to allow various sorts of pastors to preach using arguments from the Bible’s original languages despite non-mastery of Greek and Hebrew. Classical and Philological methods from the study of Indian, Islamic, Ancient Mediterranean, and East-Asian texts will also certainly come in handy.
The system will have to allow for the examination of each and every word in multiple contexts and in the overall pattern of grammar. In dead languages, a crucial part of language understanding is looking across the existing corpus at comparable uses of words and grammatical structures. AI’s enhanced search ability and summarization should be extremely helpful for this, automatically detecting potentially rare word meanings or stock quotations.
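The simplest version of "looking across the existing corpus at comparable uses" is a keyword-in-context concordance, sketched below in Python. The function name, the dictionary-of-texts corpus format, and the window size are assumptions made for this example. It matches only the exact word form, so for inflected languages a real tool would also need lemmatization or morphological analysis, which this sketch does not attempt.

```python
import re


def concordance(corpus: dict, word: str, window: int = 4) -> list:
    """List every occurrence of `word` with its surrounding context.

    `corpus` maps a source label (for example a work and passage
    reference) to its text. Returns (source, left, word, right) tuples,
    where left and right hold up to `window` neighbouring words.
    """
    target = word.lower()
    hits = []
    for source, text in corpus.items():
        # \w+ matches Unicode letters, so non-Latin scripts are tokenized
        # too, as long as the script separates words with spaces.
        tokens = re.findall(r"\w+", text.lower())
        for i, tok in enumerate(tokens):
            if tok == target:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                hits.append((source, left, tok, right))
    return hits
```

Grouping the resulting hits by source or period is one way a rare sense of a word, or a phrase that recurs verbatim as a stock quotation, might start to become visible. Summarizing those groups is where an LLM could plausibly take over from plain search.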
It will also have to access any commentarial traditions which help elucidate particularly important religious texts. A good reading of the Chinese classics should ideally be incorporating a summation of the centuries of commentaries: a neophyte historian making an argument about Sun Tzu should be confronted automatically with any commentators who disagree or agree with a particular reading.
This tool alone would likely be very useful for various applications, including making the reading of texts in their original languages more accessible to non-specialists. I hope that it can do far more, but a blessing of this project is that each step has its own benefits.
The Name
I would tentatively like to name this potential system Sima after the author of the 史記 (Records of the Grand Historian, Shiji), Sima Qian. The obvious name for such a system would be Clio, the Greek muse of history, but this name is already widely used and Sima Qian has unique qualities which make him better suited for a namesake. The 史記 (Shiji) is a work of history which attempted to be totally comprehensive of everything that came before. Sima Qian worked tirelessly to distill all knowledge up to the present, and treated his sources with a truly admirable rigor and skepticism. The resulting work effectively created the standard of history-writing in Chinese civilization. My strong hope is that the methods introduced by new AI tools might be said to do the same for our current civilization.
I welcome any critiques, comments, or collaborators. I will continue to write about related topics on this blog and elsewhere.
Note on the relationship to Cliodynamics
Cliodynamics, popularly associated with its founder Peter Turchin, is an attempt to do history based on methods more commonly associated with the sciences such as mathematical modeling. Sima AI, if properly implemented, will have several critical components devoted to translating large amounts of data into the statistical abstractions favored by Cliodynamics. The two approaches are highly complementary, and as a forward-looking historical methodology I fully approve of Cliodynamics, and I hope that Sima AI will be a useful tool for the discipline. My major concerns with Cliodynamics include that it is overly reliant on low-quality data (due to currently poor methods) and relies on fallible human coding of features that may vary widely based on how trustworthy certain sources are. I also worry that certain complex historical phenomena might not be easily reducible to simple mathematical models and require much more complex systems to understand. Sima AI would hopefully address all of these concerns, but much of the power of the full system would come from the concepts and objects currently being pioneered by Cliodynamics.
Thanks to the many friends who reviewed drafts of this post and made it far better.
It's great to see you posting again. As a hobbyist researcher, I feel the limitations of being monolingual, not just for primary sources but for secondary ones as well. It's not just Nahuatl that forms a language barrier for understanding the Aztecs; a good amount of the research literature on Aztec culture is written in Spanish. An AI system like this might make hobbyist research more common by making it more accessible.