Would Chat GPT Get a Wharton MBA?
A Prediction Based on Its Performance in the Operations Management Course

by Christian Terwiesch

ABSTRACT

OpenAI’s Chat GPT has shown a remarkable ability to automate some of the skills of highly compensated knowledge workers in general and specifically the knowledge workers in the jobs held by MBA graduates including analysts, managers, and consultants. Chat GPT has demonstrated the capability of performing professional tasks such as writing software code and preparing legal documents. The purpose of this paper is to document how Chat GPT performed on the final exam of a typical MBA core course, Operations Management. Exam questions were uploaded as used in a final exam setting and then graded. The “academic performance” of Chat GPT can be summarized as follows. First, it does an amazing job at basic operations management and process analysis questions including those that are based on case studies. Not only are the answers correct, but the explanations are excellent. Second, Chat GPT at times makes surprising mistakes in relatively simple calculations at the level of 6th grade Math. These mistakes can be massive in magnitude. Third, the present version of Chat GPT is not capable of handling more advanced process analysis questions, even when they are based on fairly standard templates. This includes process flows with multiple products and problems with stochastic effects such as demand variability. Finally, Chat GPT is remarkably good at modifying its answers in response to human hints. In other words, in the instances where it initially failed to match the problem with the right solution method, Chat GPT was able to correct itself after receiving an appropriate hint from a human expert. Considering this performance, Chat GPT would have received a B to B- grade on the exam. This has important implications for business school education, including the need for exam policies, curriculum design focusing on collaboration between human and AI, opportunities to simulate real world decision making processes, the need to teach creative problem solving, improved teaching productivity, and more.

Please cite as: Christian Terwiesch, “Would Chat GPT Get a Wharton MBA? A Prediction Based on Its Performance in the Operations Management Course”, Mack Institute for Innovation Management at the Wharton School, University of Pennsylvania, 2023.
Introduction

The Master of Business Administration (MBA) degree is one of the most popular graduate degrees in the world. There exist many elements that together create the “MBA experience,” including admissions, experiential learning, fun activities, networking, and job placement. Though as a business school professor with 25 years of teaching experience my view might be biased, I firmly believe (and very much hope) that the acquisition and certification of specific management skills also belongs to this list of important elements.

The value of any skill depends on how useful the skill is in the world as well as on how many others are out there mastering the same skill. Prior to the introduction of calculators and other computing devices, many firms employed hundreds of employees whose task it was to manually perform mathematical operations such as multiplications or matrix inversions. Obviously, such tasks are now automated, and the value of the associated skills has dramatically decreased. In the same way, any automation of the skills taught in our MBA programs could potentially reduce the value of an MBA education.

One might argue that OpenAI’s Chat GPT is the closest that technology has come so far in automating some of the skills of highly compensated knowledge workers in general and specifically the knowledge workers in the jobs held by our MBA graduates including analysts, managers, and consultants. Chat GPT has demonstrated a remarkable capability of performing professional tasks such as writing software code (including documentation and run time analysis, Kim 2022). It also performed well in the preparation of legal documents and some believe that the next generation of this technology might even be able to pass the bar exam.
As an operations management professor at Wharton and as an author of one of the most widely used operations management textbooks (Cachon and Terwiesch 2018), I thus was curious to see how Chat GPT would perform on the final exam of my Wharton MBA course. To find out, I uploaded some of my exam questions to Chat GPT and then graded the responses.

The purpose of this paper is to document how Chat GPT did on my exam (including a grade!) and reflect on the implications of this “academic performance”. To preview, here is what I found:

• Chat GPT does an amazing job at basic operations management and process analysis questions including those that are based on case studies. Not only are the answers correct, but the explanations are excellent.
• As others have argued before me, Chat GPT at times makes surprising mistakes in relatively simple calculations at the level of 6th grade Math. These mistakes can be massive in magnitude.
• The present version of Chat GPT is not capable of handling more advanced process analysis questions, even when they are based on fairly standard templates. This includes process flows with multiple products and problems with stochastic effects such as demand variability.
• Chat GPT is remarkably good at modifying its answers in response to human hints. In other words, in the instances where it initially failed to match the problem with the right solution method, Chat GPT was able to correct itself after receiving an appropriate hint from a human expert. Thus, having “a human in the loop” can be very valuable. Even more remarkable, Chat GPT seems to be able to learn over time so that in the future the hint is no longer needed.
• Given that Chat GPT has demonstrated some creativity in producing poetry and song lyrics, I tested if it would be able to produce new questions for future exams and the next edition of my book. The resulting questions were well worded and at times even humorous. However, they required substantial adjustments before becoming usable exam questions.

Part 1 of this paper reports on Chat GPT responding to some of my exam questions. Part 2 shows a sample of exam questions generated by Chat GPT. Finally, I articulate a set of implications for MBA programs in Part 3.

Many opinion pieces I have seen on this and related topics discuss Chat GPT in the abstract without showing how it handles specific questions. Like many in the business school world, I believe that learning is best done using a specific case setting, in my case Operations Management. Chat GPT has strengths and weaknesses that can be discussed in the abstract, but the specific nuances of the case setting matter. Thus, I encourage even a rushed reader to at least skim through Parts 1 and 2 as opposed to moving right to the implications.
Part 1: Chat GPT taking five exam questions

Let me begin by showing how Chat GPT did on five exam questions of my course. For each question, I will first provide the text that I used for my course’s exam, which I also fed into the Chat GPT prompt line without any adjustment or simplification. Given that most readers will not be familiar with the academic field of Operations Management, I will then briefly articulate the specific skills that are needed to answer the question. I will then show how Chat GPT answered the question alongside my “professorial” evaluation of the answer, including a grade and some comments. Readers with some basic business background are invited to “play along” and spend some time thinking about the questions themselves before looking at any of the answers.
PROCESS ANALYSIS

The first question is one of simple process analysis. The question is based on an Iron Ore refining operation in Trinidad as described in my teaching case Terwiesch and Loch 2002.

QUESTION 1

The Circored process produces direct reduced iron in Latin America. The production process starts with a pile of iron ore fines. The ore is moved through seven process steps before it emerges as finished direct-reduced-iron briquettes. The steps and their capacities are as follows:
• Pre-Heater: 60 tons per hour
• Lock Hoppers: 55 tons per hour
• First reactor: this reactor holds up to 14 tons of iron and iron needs to spend 15 minutes in the reactor
• Second reactor: this reactor holds up to 200 tons of iron and iron needs to spend 4 hours in the reactor
• Flash heater: 62.5 tons per hour
• Discharge: 59 tons per hour
• Briquetting: three machines of 27.5 tons per hour each
Where in the process is the bottleneck and what is the overall process capacity? Justify your answer.

The key academic concept that is tested in this question is the concept of a bottleneck. The bottleneck of a process is the process step with the lowest capacity. For most of the steps, the capacity is given in the question. In the case of briquetting, one needs to realize that there are three machines (and thus three times the capacity of one machine). Moreover, the capacity calculations for the two reactors are not trivial as one needs to derive the capacity by dividing the size of the reactor (e.g., 200 tons) by the reaction time (4 hours). Finally, we should not forget that this question requires a fair bit of natural language processing to absorb the input and to create the output. So, let’s see what happened when I entered this verbatim into the Chat GPT prompt.
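For readers who want to check their own reasoning first, the capacity logic just described can be sketched in a few lines of Python. This is only an illustration of the calculation, not part of the exam or of the Chat GPT prompt; the variable names are my own, and the numbers are those given in Question 1.

```python
# Hourly capacity of each Circored step, in tons per hour.
# All figures come from Question 1; the names are illustrative only.
capacities = {
    "Pre-Heater": 60.0,
    "Lock Hoppers": 55.0,
    "First reactor": 14 / 0.25,   # holds 14 tons for 15 minutes (0.25 hours)
    "Second reactor": 200 / 4,    # holds 200 tons for 4 hours
    "Flash heater": 62.5,
    "Discharge": 59.0,
    "Briquetting": 3 * 27.5,      # three parallel machines
}

# The bottleneck is the step with the lowest capacity,
# and that capacity is the capacity of the whole process.
bottleneck = min(capacities, key=capacities.get)
process_capacity = capacities[bottleneck]

print(f"Bottleneck: {bottleneck} at {process_capacity} tons per hour")
```

The two reactors are the only steps where the capacity has to be derived as size divided by time in the reactor; every other entry is read directly from the question.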
ANSWER 1

The bottleneck of the process is the second reactor, and the overall process capacity is 50 tons per hour. This is because the second reactor has the lowest capacity and the longest residence time for the iron ore among the seven process steps. The other steps have higher capacities, but they can only work at the rate of the second reactor, so the overall process capacity is limited to this rate. For example, even though the pre-heater has a capacity of 60 tons per hour and the flash heater has a capacity of 62.5 tons per hour, the iron can only be processed at a rate of 50 tons per hour in the second reactor, so the overall process capacity is limited to this rate.

Wow! Not only is the answer correct, but it is also superbly explained. The idea of the bottleneck as the rate limiting step was clearly understood and all calculations were carried out correctly. I don’t see any reasons to take points off from this answer: A+!

INVENTORY TURNS

My second question tests the concept of inventory turns and working capital requirements in a business. Here it comes:

QUESTION 2

World-Antique is a new venture under development by two graduates of a New England business school. The value proposition of the venture is simple: purchase antique paintings at yard sales and auctions and then sell them at quadruple the purchase price via a simple eCommerce site (i.e., a painting that is purchased for $100 would be sold at $400). Based on an analysis of similar retailers, the founders expect to turn their inventory 1.5 times per year. To achieve an annual revenue of $3,000,000, how much inventory expressed in $’s should the founders expect their operation to require?
As mentioned above, this question tests the concept of inventory turns. The more often a company turns its inventory, the less working capital is required for financing the operations. It is important to recognize that inventory turns are based on the costs of goods sold (COGS, the $100 for procuring the painting), not based on revenue (the $400 selling price of the painting). In this question, the inventory turns are given (1.5). To answer the question, one must go from revenue to COGS and then apply the inventory turns to obtain the required working capital. So, what did Chat GPT do?

ANSWER 2

To achieve an annual revenue of $3,000,000, the founders need to sell $3,000,000 / 4 = $750,000 worth of paintings per year. Since they expect to turn their inventory 1.5 times per year, they need to have $750,000 / 1.5 = $500,000 worth of inventory at any given time. Therefore, the founders should expect their operation to require $500,000 worth of inventory.

Short and sweet! On some other occasions when I fed this question to Chat GPT, I was also informed that this is just the working capital needed for the inventory and that additional working capital might be needed for other aspects of the business. Either way, a second A+ is in order.

COMPLEX PROCESS ANALYSIS

The first question (the Circored process) involved a process with a single input (iron ore fines) and a single output (briquettes). Many processes, however, involve multiple flows. In the business school community, the most commonly used example for teaching such flow problems with a product mix is the Harvard Business School case study “National Cranberry” (Miller and Olsen 1974). I simplified the 11-page case study to the following:
QUESTION 3

Consider the processing plant of a large cranberry cooperative. The plant receives 1,500 barrels of cranberries each hour, 70% of which are wet-harvested berries and 30% are dry-harvested berries. At the plant, dry-harvested and wet-harvested berries are processed differently. All berries are received at a central receiving station. The station has five docks where trucks unload the berries. It takes about 8 minutes to unload one truck. The average truck has about 75 barrels of berries on it. After the receiving station, dry berries are moved to destoning. Wet berries do not have to be destoned. There are three machines at destoning, each of which can process 1,500 barrels per hour. Next comes dechaffing, a step that needs to be completed for both dry-harvested and wet-harvested berries. There are three dechaffing machines, each of which can process 1,500 barrels per hour. After dechaffing, the wet-harvested berries move on to the dryer. The dryer can process 600 barrels per hour. Dry-harvested berries do not have to be processed at the dryer. Finally, all berries need to move through the separator. The separator can handle 1,200 barrels per hour. Where in the process is the bottleneck?

The key difficulty in this question lies in the fact that dry-harvested berries and wet-harvested berries use some of the same resources in the process, but don’t follow exactly the same flow. In other words, some resources only process dry-harvested berries (the destoning machines), some process only wet-harvested berries (the dryer), and some process both (the receiving station and the separator). Further, the capacity calculation at the receiving station is not trivial as one needs to realize that there are five docks, each dock can handle 60/8 trucks per hour, and that there are 75 barrels to a truck. However, given how well Chat GPT performed on the first question, this should be straightforward. Or maybe not? There is only one way to find out.
ANSWER 3

To determine the bottleneck in the cranberry processing plant, we need to identify the step in the process with the lowest capacity.

At the receiving station, there are 5 docks where trucks can be unloaded, and it takes about 8 minutes to unload one truck with an average of 75 barrels of berries. This means that the receiving station has a capacity of 5 docks x 8 minutes/dock x 75 barrels/minute = 300 barrels/hour.

At destoning, there are 3 machines that can each process 1,500 barrels per hour, for a total capacity of 1,500 barrels/hour x 3 machines = 4,500 barrels/hour.

At dechaffing, there are 3 machines that can each process 1,500 barrels per hour, for a total capacity of 1,500 barrels/hour x 3 machines = 4,500 barrels/hour.

At the dryer, the capacity is 600 barrels/hour.

At the separator, the capacity is 1,200 barrels/hour.

Since the dryer has the lowest capacity at 600 barrels/hour, it is the bottleneck in the process. This means that the dryer will be the step that determines the overall capacity of the plant. All other steps in the process will have capacity to spare, as they can handle more than 600 barrels per hour.

This answer of pointing to the dryer as the bottleneck is, once again, correct. But the answer as a whole at best deserves a C as a grade, as Chat GPT made two significant mistakes.

First, the capacity calculations at the receiving station might sound plausible, but they are wrong. The answer given by Chat GPT, “This means that the receiving station has a capacity of 5 docks x 8 minutes/dock x 75 barrels/minute = 300 barrels/hour”, does not correctly capture the capacity of the receiving station. Instead, the capacity should be calculated as “5 docks x 60 minutes/hour x 1/8 trucks/minute x 75 barrels/truck = 2,812.5 barrels/hour”. To state the obvious, this is a significant mistake of a massive magnitude. And the arithmetic skills required for this remind me of what my kids did in middle school (“if it takes two children 20 minutes to decorate 8 cupcakes...”).
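The corrected receiving-station calculation can be reproduced as a short Python sketch. The second half of the sketch goes one step further and compares the hourly load on each resource with its capacity, using the 70/30 mix stated in Question 3. That load-to-capacity ratio is an assumption of this illustration rather than a method spelled out in the text above, and all names are hypothetical.

```python
MINUTES_PER_HOUR = 60

# Receiving station: 5 docks, 8 minutes per truck, 75 barrels per truck.
docks = 5
unload_minutes_per_truck = 8
barrels_per_truck = 75
trucks_per_dock_per_hour = MINUTES_PER_HOUR / unload_minutes_per_truck  # 60/8
receiving_capacity = docks * trucks_per_dock_per_hour * barrels_per_truck

# Split the 1,500 barrels per hour into the two sub-flows (70% wet, 30% dry).
inflow = 1500
wet = inflow * 70 / 100
dry = inflow * 30 / 100

# (hourly load, hourly capacity) for each resource, in barrels per hour.
resources = {
    "Receiving": (wet + dry, receiving_capacity),
    "Destoning": (dry, 3 * 1500),        # dry-harvested berries only
    "Dechaffing": (wet + dry, 3 * 1500),
    "Dryer": (wet, 600),                 # wet-harvested berries only
    "Separator": (wet + dry, 1200),
}

# Assumption of this sketch: the resource with the highest
# load-to-capacity ratio is the one that constrains the plant.
load_ratio = {name: load / cap for name, (load, cap) in resources.items()}
bottleneck = max(load_ratio, key=load_ratio.get)

print(f"Receiving capacity: {receiving_capacity} barrels per hour")
print(f"Most heavily loaded resource: {bottleneck}")
```

Under these assumptions the receiving station has ample capacity, and the dryer carries the heaviest load relative to its capacity, which agrees with the bottleneck named in the answer.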
Remarkably, Chat GPT “got lucky” as the receiving station is not the bottleneck. Even more remarkably, it appears that Chat GPT did not believe in its own calculation. After all, 300 barrels per hour would have made the receiving station the bottleneck. However, Chat GPT correctly labeled the dryer as the bottleneck, despite it having a higher capacity of 600 barrels per hour.

The second mistake in the answer lies in the fact that the calculations and explanations really did not factor in the product mix of 70% wet-harvested and 30% dry-harvested berries. It treated the multi-product plant as a single-product plant.

To test Chat GPT’s understanding, I did what as a professor you can only do in an oral exam: I offered a small hint. Specifically, I entered the following into the prompt line:

QUESTION 4

Remember to distinguish between dry-harvested berries and wet-harvested berries.

Here is what I got as a response.

ANSWER 4

To further analyze the bottleneck in the cranberry processing plant, it is important to distinguish between dry-harvested berries and wet-harvested berries, as the flow and capacity requirements for these two types of berries are different.

Since 70% of the berries received at the plant are wet-harvested and 30% are dry-harvested, the capacity requirements for each step in the process will also be different.

At the receiving station, the capacity is 300 barrels/hour for both dry-harvested and wet-harvested berries.

At destoning, the capacity is 4,500 barrels/hour for dry-harvested berries only. Wet-harvested berries do not need to be destoned, so they do not use any capacity at this step.
At dechaffing, the capacity is 4,500 barrels/hour for both dry-harvested and wet-harvested berries.

At the dryer, the capacity is 600 barrels/hour for wet-harvested berries only. Dry-harvested berries do not need to be processed at the dryer, so they do not use any capacity at this step.

At the separator, the capacity is 1,200 barrels/hour for both dry-harvested and wet-harvested berries.

If we consider the capacity requirements for dry-harvested and wet-harvested berries separately, we can see that the dryer is the bottleneck for wet-harvested berries, with a capacity of 600 barrels/hour. However, for dry-harvested berries, the bottleneck is at destoning, with a capacity of 4,500 barrels/hour.

Overall, the bottleneck for the cranberry processing plant will depend on the mix of dry-harvested and wet-harvested berries being processed. If the plant is processing a higher proportion of wet-harvested berries, the dryer will be the bottleneck. If the plant is processing a higher proportion of dry-harvested berries, the destoning step will be the bottleneck.

Though this clearly is an improvement, Chat GPT still fails to break up the flow of the 1,500 barrels per hour into the two sub-flows of 1,050 barrels per hour wet-harvested and 450 barrels per hour dry-harvested berries. I like how it points out that the bottleneck “will depend on the mix of dry-harvested and wet-harvested berries being processed”. But the mix is explicitly given as 70:30, so the argument that the answer depends on the mix is not satisfactory and really deserves no grade better than a B-, and that is after receiving a big hint.

QUEUEING ANALYSIS

An important concept in Operations Management relates to variability in demand and in processing times. If you have 10 customers arrive over the course of an hour, you cannot assume that a customer arrives exactly every six minutes. Such variability can lead to congestion and customer waiting times in a process that has enough capacity on average. For example, a process with a 90% capacity utilization and random arrivals will lead to periods of substantial wait times. The branch of operations management that is concerned with this effect is appropriately referred to as queueing analysis and is well studied in Operations Research, Industrial Engineering and Computer Science. Consider the following question.
QUESTION 5

The Pennsylvania Department of State is implementing a new electronic voting system. Voters will now use a very simple self-service computer kiosk for casting their ballots. If that kiosk is busy, voters will patiently queue up and wait until it is their turn. It is expected that voters will spend on average 5 minutes at the kiosk. This time will vary across voters with a standard deviation of 5 minutes. Voters are expected to arrive at a demand rate of 10 voters per hour. These arrivals will be randomly spread out over the hour (you can assume that the number of voters arriving in any time period follows a Poisson distribution). What is the average amount of time that a voter will have to wait before casting their vote?

It is easy to see that, on average, the operation has a sufficient amount of capacity. 10 voters show up per hour, and each voter requires on average 5 minutes. So, in the average hour, the kiosk will be utilized for 50 minutes and be idle for 10 minutes. We refer to 50/60 as the capacity utilization of the kiosk. How did Chat GPT handle this question? Let’s take a look.

ANSWER 5

To find the average amount of time that a voter will have to wait before casting their vote, we need to calculate the average time that the kiosk is occupied.

If the average time that a voter spends at the kiosk is 5 minutes, and the demand rate is 10 voters per hour, then the average time that the kiosk is occupied is 50 minutes per hour (5 minutes per voter * 10 voters per hour).

The average time that the kiosk is not occupied is therefore 60 minutes per hour - 50 minutes per hour = 10 minutes per hour.
Since the time that a voter has to wait is equal to the time that the kiosk is occupied by other voters before them, the average time that a voter will have to wait is equal to the average time that the kiosk is occupied, which is 50 minutes per hour.

Clearly, the answer was off to a good start. Chat GPT understood that the kiosk would be utilized 50 minutes out of 60 minutes. Then, however, its logic falls apart. The statement “the average time that a voter will have to wait is equal to the average time that the kiosk is occupied, which is 50 minutes per hour” makes no sense at all.

To find the right answer, one must look at a standard equation from queueing theory. The equation for the average waiting time states that:

Average Waiting Time = Average Processing Time x Utilization / (1 - Utilization)

Plugging in an average processing time of 5 minutes and an average utilization of 5/6, we get:

Average Waiting Time = 5 x (5/6) / (1 - 5/6) = 25 minutes

So, the correct answer is 25 minutes waiting in line. If we add the 5 minutes at the kiosk, we obtain a total of 30 minutes.

Given that at this point I had developed quite an emotional attachment to Chat GPT, I could not believe that it would be entirely ignorant of queueing theory. So, once again, I provided a hint. I entered the same question, but this time made an explicit reference to queueing theory:

QUESTION 6

Using queuing analysis, what will be the long run average amount of time that a voter will have to wait before casting their vote?

Would Chat GPT be able to take advantage of the hint? Let’s take a look.
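Before turning to the response, the waiting-time equation stated above can be checked numerically. The following Python sketch simply plugs the numbers from Question 5 into that equation using exact fractions; the variable names are my own. As background, and not something the text above states, this simple form of the equation fits the question because arrivals are Poisson and the standard deviation of the time at the kiosk equals its mean.

```python
from fractions import Fraction

# Inputs from Question 5, expressed per minute.
arrival_rate = Fraction(10, 60)   # 10 voters per hour
processing_time = Fraction(5)     # 5 minutes at the kiosk on average

# Utilization: share of time the kiosk is busy (50 out of 60 minutes).
utilization = arrival_rate * processing_time

# Average Waiting Time = Average Processing Time x Utilization / (1 - Utilization)
waiting_time = processing_time * utilization / (1 - utilization)

# Time in the system adds the time spent at the kiosk itself.
time_in_system = waiting_time + processing_time

print(f"Utilization: {utilization}")
print(f"Average wait in line: {waiting_time} minutes")
print(f"Average time in system: {time_in_system} minutes")
```

Keeping the wait in line and the time in the system as two separate quantities matters for grading: the question asks only for the former.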
ANSWER 6

To determine the long-run average amount of time that a voter will have to wait before casting their vote, we can use the formula for the expected waiting time in a single-server queue:

E[W] = (Lambda * Wq) / (Mu - Lambda)

Where:
E[W] is the expected waiting time
Lambda is the arrival rate of voters (in this case, 10 voters per hour)
Mu is the service rate of the kiosk (in this case, the rate at which voters are served by the kiosk, which is equal to the inverse of the average time that a voter spends at the kiosk, or 1/5 = 0.2 voters per minute)
Wq is the average time that a voter spends in the queue (also known as the average queue time)

Plugging these values into the formula, we get:

E[W] = (10 voters per hour * Wq) / (0.2 voters per minute - 10 voters per hour)

Solving for Wq, we get:

Wq = E[W] * (Mu - Lambda) / Lambda = E[W] * (0.2 voters per minute - 10 voters per hour) / (10 voters per hour)

So to find the long-run average amount of time that a voter will have to wait, we need to find the value of E[W]. To do this, we need to consider the system as a whole and take into account the fact that the kiosk can become congested when there are too many voters trying to use it at the same time. This can lead to an increase in the average waiting time, as voters who arrive when the kiosk is congested will have to wait longer to be served.
To determine the long-run average waiting time in this scenario, we need to use more advanced techniques such as simulation or a more advanced queuing model that takes into account the effects of congestion.

It looks like Chat GPT made the correct link to queueing theory. However, it got stuck when trying to fit the numbers into the formula and eventually gave up with the answer by pointing to “simulation or a more advanced queuing model that takes into account the effects of congestion”.

How could I help Chat GPT get to the answer? A very convenient feature of Chat GPT is that the user can simply ask for a second answer whenever she is not convinced of the solution. So, I simply asked Chat GPT to give it another shot. And it worked, at least almost! Here is the second response.

ANSWER 6 (SECOND TRY)

To find the long run average amount of time that a voter will have to wait before casting their vote using queuing analysis, we need to consider the arrival rate of voters at the kiosk and the service rate of the kiosk (the rate at which voters are served by the kiosk).

The arrival rate is the rate at which voters arrive at the kiosk, which is 10 voters per hour.

The service rate is the rate at which the kiosk can serve voters, which is equal to the inverse of the average time that a voter spends at the kiosk, which is 1/5 voters per minute = 12 voters per hour.

The long run average number of voters in the system (waiting and being served) is given by the formula:

L = lambda / (mu - lambda)

where lambda is the arrival rate and mu is the service rate. Plugging in the values, we get:

L = 10 voters per hour / (12 voters per hour - 10 voters per hour) = 10/2 = 5 voters
The long run average time that a voter spends in the system (waiting and being served) is given by the formula:

W = L / lambda = 5 voters / 10 voters per hour = 0.5 hours

This is the long run average amount of time that a voter will have to wait before casting their vote. Converting this to minutes, we get:

W = 0.5 hours * 60 minutes per hour = 30 minutes.

Note that the answer is slightly off. As Chat GPT correctly computed, there will be 5 voters in the system (waiting and being served). The 30 minutes it reports is therefore the time in the system, which includes the 5 minutes at the kiosk. So, the correct answer for the waiting time before casting the vote should be 25 minutes. But this comes close enough to a passing grade, say a solid B. Again, this is after receiving a hint; the initial answer was more in the C range.

One more note on this question: when I returned to Chat GPT the next day asking the same initial queueing question (the one without the hint referencing queueing theory), Chat GPT immediately answered the question correctly; a hint was no longer necessary. It either is capable of learning from past feedback or I just got lucky (as we saw above, there seems to exist some randomness in the quality of the answers).

TOYOTA PRODUCTION SYSTEM

In addition to questions involving calculations, some of my exam questions are more conceptual in nature. Oftentimes, such conceptual questions are in a multiple choice format. So, my final question looks at the effect of using Kanban cards to move work through a process.

QUESTION 7

Which of the following statements is correct?
I. Deploying the Kanban system leads to pulling work through the system instead of pushing work.
II. Deploying the Kanban system leads to pushing work through the system instead of pulling work.
III. The Kanban system controls the work in process inventory.
IV. Kanban will always reduce the amount of inventory in the process
V. Kanban requires cross-training all workers.

The Toyota Production System with its focus on flow and just-in-time delivery is an important part of any operations course or book and the Kanban method is a key concept of it. Kanban cards are a replenishment mechanism in a process where a resource (machine or operator) only produces something when there exists an explicit demand for it. This demand could come from the customer (external demand) or from the subsequent resource in the process (internal demand). Either way, work is now pulled through the process, and it is demand that drives the flow, not machine or operator availability. As we can see, Chat GPT had no problems with this question.

ANSWER 7

The correct statements are:
I. Deploying the Kanban system leads to pulling work through the system instead of pushing work.
III. The Kanban system controls the work in process inventory.
IV. Kanban will always reduce the amount of inventory in the process (if implemented correctly).

Kanban is a method for managing the flow of work in a production process. It is based on the principles of just-in-time production, which aims to minimize inventory and waste by only producing what is needed, when it is needed. In a Kanban system, work is pulled through the process as it is needed, rather than being pushed through the process regardless of demand. The Kanban system helps to control the amount of work in process inventory by using visual signals (such as cards or electronic displays) to indicate when work should be started and when it is ready to move to the next stage in the process. Kanban does not necessarily require cross-training all workers, although it may be beneficial to have a more flexible workforce that can handle multiple tasks.
17 This is a really good answer. We can and should debate the validity of statement IV. When I wrote this questio n f or my students, I intended this to be a “trick question” and the statement to be fal se (if one puts too many K anban cards into the process, strictly speaking, inventory could go up). But it appears a s if Ch at GPT sa w through my evil intent and wisely added the clause “ if im pleme nted correctly”. The explanation is once again excellent, including the final stateme nt about the usefulness of cross -tr aining. A+. SUMMARY The p revious pages should convince even a skeptical reader that Chat GPT ha s rem arkable skills in handling problems as used extensively in the training and te sting of our MBA students. Combining the results of the questions, I would grade this perfo rmanc e as a B to B -. The following reference point might help put this performance into perspective. Until Whar ton allowed students more flexibility in which courses they take, this Operations Management course was a required course that eve ry student had to take. However, we did allow stu dents to waive this course if t hey could demonstrate content maste ry o n a wai ver exam. The performance of Cha t GPT reported above would hav e be en sufficient to pass the wai ve r exam, though by a very small ma rgin. Part 2: Chat GPT Generating Exam Questions After exploring to what extent my students might be better off with this technology, I also wante d to find out if I as a professor and author c ould take advantage of it. I know that only few readers (most likely fellow instructors) will pity me for this, but I face the challenge of coming up with new te st questions every year and for e very new e dition of my book. By now, I have written 1000’s of questions and, at time s, I feel I have exhausted my ima gination for new problems. C an I turn to Chat GPT to come up with new exam questions? 
To find out, I entered the first process analysis question (the Circored iron ore process) into the Chat GPT prompt along with the following instruction:

QUESTION 8
Create similar questions in the context of a coal mine.

Here is the response, i.e., the question generated by Chat GPT.
ANSWER 8
In a coal mining operation, the production process starts with extracting coal from underground seams. The coal is then moved through six process steps before it emerges as finished product. The steps and their capacities are as follows:
Crusher: 50 tons per hour
Conveyor belt: 45 tons per hour
Wash Plant: this plant holds up to 20 tons of coal and coal needs to spend 1 hour in the plant
Dryer: holds up to 100 tons of coal and coal needs to spend 3 hours in the dryer
Screening: 62.5 tons per hour
Loading: 59 tons per hour
Where in the process is the bottleneck and what is the overall process capacity?

This actually is a pretty good question. Chat GPT picked up the format and the logic of the iron ore process, but was able to transfer the question into the coal domain. I am by no means an expert on coal mining and processing, but these steps exist in coal plants and their sequence is at least plausible.

To replicate this for another question type and to add some humor to this paper, I also requested a waiting time question (see the voting operation discussed above), one that involves a very specific context.

QUESTION 9
Generate three queueing theory questions about average waiting times in the context of Homer Simpson's day to day life.

Here is the response by Chat GPT.
ANSWER 9
"At Moe's Tavern, Homer Simpson and his friends often wait in line for a drink. On average, customers spend 2 minutes at the bar and arrive at a rate of 20 customers per hour. The bartender takes an average of 3 minutes to serve each customer. What is the average waiting time for Homer and his friends before they are served a drink?"
"Homer Simpson spends a lot of time waiting in line at the Springfield DMV. On average, customers spend 5 minutes at the counter and arrive at a rate of 30 customers per hour. The DMV employees take an average of 10 minutes to serve each customer. What is the average waiting time for Homer before he is served at the DMV?"
"Homer Simpson likes to buy donuts at Lard Lad Donuts. On average, customers spend 1 minute at the counter and arrive at a rate of 15 customers per hour. The cashier takes an average of 2 minutes to serve each customer. What is the average waiting time for Homer before he is able to buy donuts at Lard Lad Donuts?"

Clearly, these are plausible questions and likely are more humorous than anything I could have produced myself (though, given that I am a German Operations Management professor, one might argue that this is not a high bar to clear). They are certainly good enough that I will take advantage of this technology in the future.

Nevertheless, two shortcomings in these questions need to be pointed out. First, each question really should provide only two numbers: the arrival rate (demand, 20 customers per hour in the first question) and the time to serve one customer (3 minutes in the first question). I have no idea why the additional number (“customers spend 2 minutes at the bar”) is provided and how it should be interpreted in the context of the question. Second, for queueing theory to be the right tool for the job, the demand for the process (the rate of customer arrivals) should be somewhat close to the capacity of the process (the rate of service). Consider the second question.
30 customers arrive per hour, each of them requiring 10 minutes of work. Clearly, one DMV employee would not be able to deal with this load and multiple employees would be needed. Though the question generated by Chat GPT speaks of “DMV employees” (plural), it does not state how many there are, making it impossible to answer the question.
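The two generated questions can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original exam or its solutions: the function names are hypothetical, and it assumes that the capacity of a "holding" step (Wash Plant, Dryer) equals its volume divided by the required residence time, and that every server in a queue works at the stated average service time.

```python
import math

# Capacities in tons per hour, taken from the coal mine question (Answer 8).
# Assumption: for the two holding steps, capacity = volume / residence time.
coal_steps = {
    "Crusher": 50.0,
    "Conveyor belt": 45.0,
    "Wash Plant": 20.0 / 1.0,   # 20 tons held for 1 hour
    "Dryer": 100.0 / 3.0,       # 100 tons held for 3 hours
    "Screening": 62.5,
    "Loading": 59.0,
}


def find_bottleneck(capacities):
    """Return (step name, capacity) of the step with the lowest capacity."""
    name = min(capacities, key=capacities.get)
    return name, capacities[name]


def implied_utilization(arrivals_per_hour, service_minutes, servers=1):
    """Demand divided by capacity; at 1.0 or above the queue does not stabilize."""
    capacity_per_hour = servers * 60.0 / service_minutes
    return arrivals_per_hour / capacity_per_hour


def min_servers(arrivals_per_hour, service_minutes):
    """Smallest number of servers that keeps utilization strictly below 100%."""
    offered_load = arrivals_per_hour * service_minutes / 60.0
    return math.floor(offered_load) + 1


# (arrival rate per hour, service time in minutes) from the three questions in Answer 9
homer_questions = {
    "Moe's Tavern": (20, 3),
    "Springfield DMV": (30, 10),
    "Lard Lad Donuts": (15, 2),
}

if __name__ == "__main__":
    print(find_bottleneck(coal_steps))
    for place, (rate, minutes) in homer_questions.items():
        print(place, implied_utilization(rate, minutes), min_servers(rate, minutes))
```

Under these assumptions, the Wash Plant (20 tons per hour) is the bottleneck of the coal mine question. For the DMV question, a single employee would face an implied utilization of 500%, so at least six employees would be needed for a waiting time to be well defined. The Moe's Tavern question sits at exactly 100% utilization with one bartender, which is also problematic for a standard waiting time calculation.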
Part 3: Implications for MBA Programs and Faculty
I am not the first to speculate about the impact of Chat GPT on education. However, I propose that the impact of Chat GPT on business school education in general and Operations Management in particular goes beyond what its impact will be on teaching mathematics, history, biology, or literature. “Operations” has its roots in the Latin word “opus”, which stands for “work”. The purpose of my Operations Management class is thus to help students analyze and improve the way people work, now and in the future. The science of biology has not changed and will not change because of Chat GPT. How people work, in contrast, is constantly changing as technology advances.

Based on the B to B- performance of Chat GPT in my course and its ability to generate creative (though imperfect) questions for my future exams, I see the following implications for us as business school faculty.

Implication 1: Be mindful of what Chat GPT can and cannot do
The moment I saw the answer to my first question, I fell in love with Chat GPT. I had used other natural language processing and AI software before, but this simple user experience and the great answer put me in a state of awe, and I am sure it has impressed many users before me. But we should not forget that it made major mistakes in some fairly simple situations. Being off by a factor of 10 in the receiving station of Question 3 is below the academic performance of a middle school student. The average grade of Chat GPT was a B to B- in a domain that is well documented in thousands of pieces of knowledge that are accessible online. We have many reasons to believe that the technology is getting better over time. But we are still far from an A+ for complex problems and we still need a human in the loop.
Implication 2: Continue to teach the foundations
I am sure that there will be many calling for a change in course content, making an argument of the type “if a computer can do it at zero marginal cost, a student should not need to spend time and money on mastering this skill” or “if a bot can pass the waiver exam of a course, clearly these skills should be removed from the curriculum!”. I have some sympathy for this argument. Thirty-five years ago, as an undergraduate student in Germany, I learnt how to manually invert a matrix and how to solve simple optimization problems with nothing but pen and paper. After successfully displaying these skills on my final exam, I have never used them again.
But I would not go as far as making the claim that these skills were a complete waste of my time. In my view of education, an elementary school student still needs to learn that 7 x 7 = 49 and that the capital of Pennsylvania is Harrisburg, even though calculators have been widely used for over 50 years and students can use Google or Wikipedia to find answers for most factual questions.

It is the nature of foundational skills that they are required to comprehend more advanced topics. How does the increasing market volatility post Covid impact the suitability of a just-in-time supply chain? Should US manufacturers with most of their suppliers located in China embrace a dual sourcing strategy? To competently answer these questions, one needs a solid understanding of the foundations of operations management. You need to be able to walk before you can run! So, whether we are business school faculty or elementary school teachers, as educators we still must teach the foundations.

Implication 3: Deal with the cheating when testing foundational knowledge
Many educators are interested in the Chat GPT discussion out of a concern that their students might be cheating on homework assignments and final exams. They should be. Though in the past I have had an open book, open notes policy for my exams and I allowed students to use computing devices, I will now join thousands of professors and teachers and explicitly ban the usage of Chat GPT and other technologies of this type for the purpose of homework assignments and the final exam in my foundations courses.

I realize that regulating exciting new technology is oftentimes perceived as a desire to hang on to the status quo. Nevertheless, allowing a student to access Chat GPT during an exam that tests facts and foundational concepts is like allowing the student to call a friend of average academic competence and have that friend take the exam for her.
Reliable tests play an important role in teaching and skill certification and this should not be compromised because of a new technology.

Implication 4: Mimic the workplace by teaching how to evaluate a proposed plan of action
MBA programs are professional degrees that prepare students for careers in the business community. Technologies like Chat GPT are already used in the workplace and that usage is only going to increase. Their ultimate goal is to improve managerial decision making.

Managerial decision making involves creating a set of alternatives and then critically reflecting on which alternative is the best for the situation at hand. As our MBA graduates advance in their careers, they will make more and more of these decisions in groups where alternatives will be generated by consultants, co-workers, and direct reports. The skill of looking at a suggested alternative that is well presented and looks totally plausible and then being able to critically evaluate if the suggested alternative is fundamentally flawed or absolutely brilliant is thus among the most important skills in a management career.

With Chat GPT playing the role of that smart consultant (who always has an elegant answer, but oftentimes is wrong), we thus have a perfect training ground for developing that skill. Just think back to Answer 3 (the Cranberry process). It was well presented and the numbers looked coherent and plausible, but it was wrong nevertheless. By letting students use Chat GPT during case discussions, I can thus emulate the decision making process a senior executive would face in the workplace.

Implication 5: Let students use Chat GPT, but simultaneously raise the bar for assignments
The scenario that most K12 teachers are concerned about is that a student who in the past would go to the library and spend four hours on a homework assignment contrasting the views of Albert Camus and Jean-Paul Sartre on existentialism is now “getting away” with a five-minute interaction with Chat GPT. As a result of this “shortcut”, the student learns less than before. If I were a K12 teacher, I would be concerned about my students taking such shortcuts as well.

But I am not a K12 teacher. I have the privilege of teaching highly motivated students, most of whom are making a substantial personal and financial sacrifice to be in my classroom. With or without Chat GPT, I can get a certain amount of time from them each week (say four hours). It is now up to me to come up with assignments that are challenging enough that they warrant that time investment. To the extent that we believe that Chat GPT gives the students a head start on their homework, it is my job to hold them accountable to a higher standard.
In many ways, this is similar to how we have been using group assignments for many years. Just as we would expect a better deliverable for an assignment that was given to a group of five students, we should expect more from a student that we encourage to collaborate with a technology such as Chat GPT.

Implication 6: Ask students to imagine the new rather than tweaking the old
In his book “Zero to One”, Peter Thiel challenged entrepreneurs to move beyond the existing paths of innovation. Thiel distinguishes between “going from 0 to 1”, which corresponds to creating something fundamentally new (Bill Gates coming up with a PC operating system), and improving upon what exists by going from “1 to n” (moving from Windows 10 to Windows 11).
The knowledge of Chat GPT is inherently built on synthesis. Even when we ask it to engage in creative problem solving (recall Question 7), Chat GPT will always stay in a solution space that is defined by what it has seen in the past. Outstanding business ideas, in contrast, oftentimes go beyond optimizing what is and move to imagining what could be.

For example, Chat GPT and its successors will likely excel in finding the optimal delivery path for a truck that has to make a given number of shipments to a given set of addresses. But will it be able to question the problem? Will it challenge the mode of delivery? I appreciate the need to teach our students how to find the shortest path through a network. But wouldn’t the A+ student come up with an idea such as a strategically positioned locker that customers walk to in order to pick up their shipment, or the use of a drone for delivery? It is up to us as educators to reward, if not require, such out-of-the-box thinking.

Implication 7: Don’t be shy using Chat GPT to improve the productivity of the teaching process
Nine years ago, I was among the first faculty to make my MBA course content available to the general public by creating a MOOC. Video based teaching allowed for enormous increases in teaching productivity (Terwiesch and Ulrich 2014), something that, at first, many of my students and faculty colleagues were skeptical about. The strategic decision, of course, was who should benefit from that higher productivity. I now use these videos to help students who have missed class or are struggling for other reasons. This allows me to customize learning in a way I could not do before. Using this video content has resulted in improved learning and has reduced student anxiety. It has also increased test scores.

Recall my experiment with having Chat GPT create exam questions for me.
It normally takes me about 20 hours of work to create an exam and another 10 hours for TAs to test the exam and write solutions to it. Now, I think we could get my exam writing time down to 10 hours and the TA time down to 5. In other words, we have a 100% productivity increase in the “exam writing operation”. It is easy to imagine similar magnitudes of improvement in grading processes, tutoring, and office hours.

It is now up to us to determine what to do with this increased productivity. In my view, we should return it to the students in the form of extra meetings outside class, personal attention, joint social activities, or the design of new course materials.
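The 100% figure above follows if productivity is measured as output (exams) per hour of labor: cutting the required hours in half doubles the output per hour. A minimal sketch of this arithmetic, using a hypothetical helper function and the hour estimates from the text:

```python
def productivity_increase(hours_before, hours_after):
    """Relative increase in output per labor hour when the same output needs fewer hours."""
    return hours_before / hours_after - 1.0


# Hour estimates taken from the text: faculty 20 -> 10 hours, TAs 10 -> 5 hours.
faculty_gain = productivity_increase(20, 10)
ta_gain = productivity_increase(10, 5)
combined_gain = productivity_increase(20 + 10, 10 + 5)

if __name__ == "__main__":
    print(f"faculty: {faculty_gain:.0%}, TAs: {ta_gain:.0%}, combined: {combined_gain:.0%}")
```

Note that a 50% reduction in hours corresponds to a 100% increase in productivity, not a 50% increase, because the same exam is now produced with half the labor input.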
Afterthought
Given my new relationship with Chat GPT, I can’t help but turn to the computer one last time and ask my new digital friend what it thinks its impact on MBA education will be in the future.

QUESTION 10
What will be the impact of the Chat GPT technology on how MBA students are taught. Specifically, how will Chat GPT impact MBA education differently from K12 education?

Look at the answer and draw whatever conclusion you like.

ANSWER 10
Chat GPT technology has the potential to greatly enhance MBA education by providing students with personalized, interactive learning experiences. It can be used to generate personalized study materials, provide real-time feedback, and facilitate online discussions and group projects. Additionally, Chat GPT can be used to simulate real-world business scenarios, allowing students to practice decision-making and problem-solving in a safe, virtual environment.
The impact of Chat GPT on MBA education may be different from its impact on K-12 education because MBA students are typically older and more experienced, and are often working professionals. They may have different learning styles and needs than K-12 students, and may require more advanced, specialized content. Additionally, MBA programs often involve a greater emphasis on practical, real-world applications, which Chat GPT may be particularly well-suited to support.
References
Terwiesch, Christian, Karl T. Ulrich, “Will Video Kill the Classroom Star? The Threat and Opportunity of MOOCs for Full-Time MBA Programs”, White Paper, Mack Institute, 2014.
Miller, Jeffrey G., “National Cranberry Cooperative”, Harvard Business School Case 675-014, August 1974 (revised November 1983).
Terwiesch, Christian, Christoph H. Loch, “Pumping Iron at Cliffs and Associates: The Circored Iron Ore Reduction Plant in Trinidad”, Case of the Wharton-INSEAD Alliance, 2004.
Cachon, Gerard, Christian Terwiesch, Matching Supply with Demand: An Introduction to Operations Management, 4th edition, McGraw Hill, 2018.

About the Author
Christian Terwiesch is the Andrew M. Heller Professor at the Wharton School of the University of Pennsylvania. He is a Professor in and the chair of Wharton’s Operations, Information, and Decisions department, co-director of Penn’s Mack Institute for Innovation Management, and also holds a faculty appointment in Penn’s Perelman School of Medicine. His research on Operations Management and on Innovation Management appears in many of the leading academic journals, ranging from Management Science to The New England Journal of Medicine. He is an award-winning teacher with extensive experience in MBA teaching and executive education.

Professor Terwiesch is the co-author of Matching Supply with Demand, a widely used textbook in Operations Management that is now in its third edition. Based on this book, Professor Terwiesch has launched the first Massive Open Online Course (MOOC) in business on Coursera. By now, well over half a million students have enrolled in the course.

His first management book, Innovation Tournaments, was published by Harvard Business School Press. The novel, process-based approach to innovation outlined in the book was featured by BusinessWeek, the Financial Times, and the Sloan Management Review and has led to innovation tournaments in organizations around the world.
His latest book, Connected Strategies, combines his expertise in the fields of operations, innovation, and strategy to help companies take advantage of digital technology leading to new business models. The book has been featured as the cover story of the Harvard Business Review and has been featured by Bloomberg / BusinessWeek as one of the best books in 2020.

Professor Terwiesch has researched with and consulted for various organizations. From small start-ups to Fortune 500 companies, he has helped companies become more innovative, often by implementing innovation tournament events and by helping to restructure their innovation portfolio. He holds a doctoral degree from INSEAD and a Diploma from the University of Mannheim.