ACS
RISE

Beyond Virtual: ChatGPT’s Role in the Residency Application Cycle

Elizabeth L. Malphrus, MD, MPP; William Piwnica-Worms, MD; Joseph M. Serletti, MD; Joshua Fosnot, MD

May 16, 2024

Intended Audience

Medical students, faculty, and staff who participate in the residency application process.

Objective

The reader should have a better understanding of the ways that students, faculty, and programs may be using generative AI technologies as a source of ideas, content, and labor throughout the residency application cycle.

Key Learning Points

  • Tools like ChatGPT can be used to produce or refine the many forms of written content that make up a residency application, from emails and recommendation letters to personal statements and academic manuscripts.

  • Medical students are finding creative ways to use generative AI to study, to develop ideas, and to make written communications more polished and efficient.

  • Programs and faculty require a working understanding of the capabilities of these tools in order to guide trainees through the inevitable challenges associated with implementation of a new technology.

  • As the application of generative AI expands, in-person interactions become increasingly critical as a component of comprehensive evaluation of candidates for surgical residency.

Over the last year, artificial intelligence (AI) tools like ChatGPT have become increasingly woven into the fabric of daily life. As the residency interview season continues, it is important to consider the implications of these AI tools and how they may be used by applicants, letter writers, and programs as a source of content, ideas, and labor. In this paper, we explore the current landscape of AI-related policies, applicant perspectives, and strategies for managing this new set of issues for this application season and beyond.

ChatGPT, an AI model developed by OpenAI, operates on a principle of word prediction, learning from vast volumes of text data to string together the next most likely word to create coherent sentences. The result is surprisingly complex written content generated in seconds from a simple prompt. This content can then be refined with more detailed requests.
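The word-prediction principle described above can be illustrated with a deliberately simplified sketch. The toy model below counts which word most often follows each word in a tiny sample of text, then extends a prompt one word at a time. Real models like ChatGPT use neural networks trained on vast corpora rather than simple counts, and the sample text here is invented purely for illustration, but the underlying idea of choosing the next most likely word is the same.

```python
from collections import Counter, defaultdict

# Tiny invented training sample (for illustration only).
training_text = (
    "the patient was taken to the operating room "
    "the patient tolerated the procedure well"
)

# Count how often each word follows each other word.
follows = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    follows[current_word][next_word] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the training text."""
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

# Extend a prompt by repeatedly appending the most likely next word.
prompt = ["the", "patient"]
for _ in range(3):
    nxt = predict_next(prompt[-1])
    if nxt is None:
        break
    prompt.append(nxt)

print(" ".join(prompt))
```

Even this crude version produces locally plausible phrases; scaling the same idea up to billions of learned parameters and training documents is what makes the output of large language models read as coherent prose.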

While impressive, this generative ability does not necessarily reflect understanding or reality. ChatGPT’s responses are driven by its training data, such that answers reflect the limitations, inaccuracies, and even potential biases of its source content. In its current form, it can access the internet but does not necessarily provide real-time information, and by design it does not provide information that OpenAI deems potentially harmful, including any information about individual people. It also tends to produce long-winded, vague, and heavily caveated answers.1–3

The emergence of AI as a potential tool for surgery research and education is viewed by some as a cause for alarm. Many leading journals in academic surgery have adopted a cautious stance, discouraging the use of AI models like ChatGPT to produce academic writing.4,5 Similarly, many medical schools have now instituted policies that discourage, or outright ban, the use of AI for academic assignments. This reflects attitudes in academia and in public discourse more broadly, with numerous articles appearing in the lay and academic press regarding AI tools and their impact on education at all levels, as well as on industries that rely heavily on written work, including technical writing and legal professions.6,7

In response, some are implementing new tools to attempt to detect use of AI, similar to existing software programs in wide use at academic institutions to detect plagiarism, though the results of these efforts have been imperfect.8 While plagiarism detection software typically compares a new piece of writing to existing published works, attempting to compare a piece of writing to what can be generated by AI is more challenging given the wide variation in AI output based on slight differences in phrasing or even a given ChatGPT user’s chat history. For example, users can ask ChatGPT to generate multiple versions of a given piece of writing based on the same prompt and then choose among different sections. It is also possible for users to provide ChatGPT with a sample of their own writing and ask for a new piece of writing in the same style or voice. Given this complexity, it seems unlikely that an “AI detector” will reach a level of accuracy to be useful in the near future, if at all.

Medical students are at the front line of this change and have the potential to benefit from its implementation, as well as to run up against the growing pains associated with the integration of a new technology across numerous realms, including in academic writing and publishing, professional communication, and their own learning experiences. In the interest of understanding their perspectives and experiences, we completed structured interviews with medical students from our institution who are applying to plastic surgery residency this year. As expected, applicants are aware of AI tools and many are beginning to use them for study, preparation, and content generation. For example, some students reported using ChatGPT to help them make sense of complex concepts by asking for simplified explanations or to structure information into tables or lists to use as study guides. Additionally, some students reported using ChatGPT to help write first drafts of school assignments, generate potential titles for manuscripts, or proofread their emails or cover letters. While students generally felt positively toward AI tools as useful adjuncts for their activities, they also expressed concern regarding certain potential uses, particularly the possibility of faculty using AI tools to write recommendation letters or review applications. These medical students’ responses are summarized in Figure 1.

While every new technology introduces its own set of challenges, we believe that it is in line with the values of our profession to embrace innovation. In the realm of academic collaboration, using tools like ChatGPT can be likened to the initial stages of mentorship or partnership. Just as a student or resident might draft a preliminary manuscript under the guidance of a senior researcher, ChatGPT can efficiently provide a foundational structure upon which more refined writing can be built. It provides a starting point for iterative refinement or a springboard for ideas, much like the rough drafts or initial insights of a novice researcher.

However, just as these partnerships with students and residents require oversight, expertise, and revision, ChatGPT-generated content generally does not produce a perfectly polished product, particularly in the setting of technical academic writing. Thus, while ChatGPT can be extremely useful in the early phases of academic writing, or for refining language and detecting errors at the end, it complements—rather than replaces—expertise and critical thinking. It also has the potential to streamline and expedite the writing process, potentially allowing users to redirect their energies toward idea generation and refinement rather than drafting and wordsmithing.

The rise of AI tools and virtual interfaces—coupled with the declining use of numerical measurements such as grades and test scores in medical education—has created a challenging environment for surgical residency programs and applicants. The COVID-19 pandemic necessitated a pivot to virtual interviews, making the application process more accessible, if less nuanced and personal. While objective information about programs can be easily conveyed via virtual means, the less tangible human aspects of the interview process have not translated as readily. The opportunity to meet applicants in person—and for applicants to experience programs firsthand—has become even more valuable amid the rise of AI tools that make verbal polish and professional written communications easy to simulate.

The extent to which AI will become integrated into the surgical residency application process remains to be seen. However, its potential as an educational resource, a source of ideas, and a tool to increase productivity and efficiency warrants exploration, even in the face of ongoing concerns regarding attribution and authorship. Students will likely continue to explore the potential of this new technology parallel to their surgical education and should be encouraged to exercise their curiosity safely and transparently. In the interest of this goal, it is just as critical for programs and faculty to understand these tools and their potential uses in surgery research and education.

Figure 1: Summary of Applicant Perspectives on Use of AI Tools during Interview Season  

  1. Communication about AI Tools from Medical Schools and Organizations
  • No official guidance from ACAPS, AMCAS, or NRMP
  • Medical schools advise against using AI, especially in applications
  • Concerns about potential HIPAA violations when using AI in medical contexts
  • ERAS/PSCA application requires applicants to confirm that they did not use AI/ChatGPT to write essays
  2. Usage of AI Tools for Study and Prep
  • Used for case prep, especially for unfamiliar topics
  • Useful in understanding general concepts before diving into textbooks
  • Caution advised, not to be overly reliant or trust specifics
  3. AI Tools in Preparing for Application and Interview Season
  • Some have tried using AI for outlines, but the specifics were not satisfactory without significant editing
  • Used to generate practice interview questions and to brainstorm responses
  4. Knowledge of Other Applicants Using AI
  • Uses among peers include draft creation for letters, research tasks, medical school written assignments, and case prep
  • General agreement on the utility of AI but caution against replacing personal effort
  5. AI Tools in Drafting Manuscripts
  • Using AI for brainstorming or angles for interpreting results seen as useful
  • If used verbatim, it could be considered plagiarism
  • Ethical use: Aid in phrasing, transitions, structure, but start and end with one's work
  6. AI Tools in Writing Recommendation Letters
  • Potential for generic and inauthentic letters
  • Some faculty have mentioned using AI for this purpose
  • Concerns about authenticity and potential errors
  7. AI Tools in Reviewing Applications
  • Caution against AI selecting applicants due to its current limitations
  • Potentially used for initial screening
  • Proper application of the right AI tool is crucial
  8. AI Tool Usage during Residency
  • Expected to be useful in case prep, especially as future versions evolve
  • Can be used for drafting notes or templates
  • Anticipated utility in early residency

References

  1.  Johnson D, Goodman R, Patrinely J, et al. Assessing the Accuracy and Reliability of AI-Generated Medical Responses: An Evaluation of the Chat-GPT Model. Res Sq. Published online February 28, 2023:rs.3.rs-2566942. doi:10.21203/rs.3.rs-2566942/v1

  2. Gravel J, D’Amours-Gravel M, Osmanlliu E. Learning to Fake It: Limited Responses and Fabricated References Provided by ChatGPT for Medical Questions. Mayo Clinic Proceedings: Digital Health. 2023;1(3):226-234. doi:10.1016/j.mcpdig.2023.05.004

  3. Bhattacharyya M, Miller VM, Bhattacharyya D, Miller LE. High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content. Cureus. 2023;15(5):e39238. doi:10.7759/cureus.39238

  4. Weidman AA, Valentine L, Chung KC, Lin SJ. OpenAI’s ChatGPT and Its Role in Plastic Surgery Research. Plastic and Reconstructive Surgery. 2023;151(5):1111. doi:10.1097/PRS.0000000000010342

  5. Shaffrey EC, Eftekari SC, Wilke LG, Poore SO. Surgeon or Bot? The Risks of Using Artificial Intelligence in Surgical Journal Publications. Ann Surg Open. 2023;4(3):e309. doi:10.1097/AS9.0000000000000309

  6. Stanford Law School. GPT-4 Passes the Bar Exam: What That Means for Artificial Intelligence Tools in the Legal Profession. Published April 19, 2023. Accessed August 23, 2023. https://law.stanford.edu/2023/04/19/gpt-4-passes-the-bar-exam-what-that-means-for-artificial-intelligence-tools-in-the-legal-industry/

  7. Marche S. The College Essay Is Dead. The Atlantic. Published December 6, 2022. Accessed August 23, 2023. https://www.theatlantic.com/technology/archive/2022/12/chatgpt-ai-writing-college-student-essays/672371/

  8. Fowler GA. We tested a new ChatGPT-detector for teachers. It flagged an innocent student. Washington Post. Published April 14, 2023. Accessed August 23, 2023. https://www.washingtonpost.com/technology/2023/04/01/chatgpt-cheating-detection-turnitin/

Authors

Elizabeth L. Malphrus, MD, MPP (corresponding author)
Plastic surgery resident, PGY-4
University of Pennsylvania Health System
William Piwnica-Worms, MD
Plastic surgery resident, PGY-4
University of Pennsylvania Health System
Joseph M. Serletti, MD, FACS
Chief, Division of Plastic Surgery
Henry Royster-William Maul Measey Professor in Plastic and Reconstructive Surgery
University of Pennsylvania Health System
Joshua Fosnot, MD
Program director, Penn Plastic Surgery Residency
Associate professor of clinical surgery
University of Pennsylvania Health System