CSSE 413: Assessment of ChatGPT
In order to assess the current strengths and weaknesses of LLMs,
select, conduct, assess, and present in-depth experiments with ChatGPT.
If you have access to the most recent version, 4.0, that is great; if
not, that will be fine too.
Organizational Aspects
- Please work on the project in pairs.
- You may wish to select a topic about which you know a lot. This
could be a very narrow, unusual topic, such as underwater basket
weaving, about which there is not a lot of training
data. Alternatively, you may choose an area for which there is a
lot of training data, such as linked lists. Either way, ensure that you
have a fair amount of expert knowledge.
- Presentation Schedule
- Conduct a fair number (>= 20) of experiments which build on what
you learned from prior experiments. In other words, do not simply pose
20 random prompts to ChatGPT. Instead, pose a well-thought-out
prompt and, based on the response, pose a well-thought-out follow-up
prompt. See whether you can develop a good line of inquiry. I imagine
that you can keep this up for about 5-8 prompts. Then, start a new
line of inquiry. Conduct the experiments so that
they help you to assess the power and limitations of ChatGPT in your
chosen domain. Please ensure you state which version you are
using. You could also study both versions, to see how much version 4 is
an improvement over version 3.5.
- Write up your experiments and your subjective evaluation. Please
include some representative sample dialogs you had with ChatGPT. The
write-up should be concise and about 4 pages long, single-spaced.
- You will give an 8-minute class presentation of your key findings
and assessments.