Competition Link: https://hackaichallenge.devpost.com/
LLMs hallucinate and cannot think critically. They miss simple questions like "How many r's are in 'strawberry'?" and "Is 2.11 greater than 2.9?". We believe this is because they "think" only associatively, without a robust understanding of the world tempered by formulaic, algorithmic thinking.
To borrow a metaphor from cognitive science, they are capable only of implicit learning, not explicit learning. Though they learn extremely complex patterns across words and paragraphs, their lack of self-awareness means they cannot directly observe their own thinking; thus, they cannot explore how to think more efficiently. This leads to computational bloat: an explosion in the complexity and size of LLMs.
This has many drawbacks, such as increased demand on the energy grid and models too complex for us to understand their inner workings. Humans, on the other hand, are imprecise but extremely computation- and energy-efficient. Perhaps if we could give models some of our characteristics, chiefly executive function, they could become more efficient and understandable while retaining their superior precision.
18 days until submission date
Use RAG to serve as working memory for LLM-level "thinking", as a proof of concept of executive function.
Use RAG to feed the LLM datasets of English and math questions, asking it to solve both associative and algorithmic problems in each subject.
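As a rough illustration of the retrieval step, here is a minimal sketch that scores documents by keyword overlap with the question. A real pipeline would use an embedding model and a vector store (as in the NVIDIA Workbench RAG examples linked below); the corpus entries and scoring function here are purely illustrative placeholders.

```python
def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the question.

    Stand-in for embedding-based retrieval; scores by raw word overlap.
    """
    q_words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

# Toy "working memory" of previously explained algorithms.
corpus = [
    "Addition combines two numbers digit by digit, carrying overflow.",
    "Counting letters: scan the string and tally each matching character.",
    "Subtraction removes one quantity from another, borrowing when needed.",
]

context = retrieve("How do I count the r's in a word?", corpus)
```

The retrieved `context` would then be prepended to the LLM prompt so the model can select and apply the relevant algorithm rather than answer purely associatively.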
Level 1: It can accurately answer both types of question by selecting and executing the correct algorithm on simple problems, and it can describe how each algorithm works.
E.g. addition and subtraction, counting the "r"s in "strawberry", etc.
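The Level 1 targets above are exactly the kind of small, deterministic algorithms the model should select and run. A sketch (function names are illustrative, not part of the project yet):

```python
from decimal import Decimal

def count_char(text: str, char: str) -> int:
    """Count occurrences of a single character, case-insensitively."""
    return sum(1 for c in text.lower() if c == char.lower())

def greater_decimal(a: str, b: str) -> str:
    """Return the larger of two decimal strings, avoiding the '2.11 > 2.9' trap."""
    return a if Decimal(a) > Decimal(b) else b

count_char("strawberry", "r")     # → 3
greater_decimal("2.11", "2.9")    # → "2.9"
```

Passing the model these algorithm descriptions via RAG, then checking its answers against the deterministic output, gives a direct measure of Level 1 competence.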
Level 2: It can compose previously seen algorithms to hypothesize how to solve slightly more complex problems.
E.g. multiplication is described to it; it is then asked to devise an algorithm for multiplication problems with increasing numbers of digits. The process is then repeated for division.
The purpose of this stage is to train it to hypothesize.
Level 3: It can generalize and learn other fields, such as biology.
E.g. read scientific papers and assist with research based on an accurate understanding of the topic's mechanics.
Does the project demonstrate quality software development? Does the project leverage NVIDIA AI Workbench? How is the quality of the code? Is there a balanced blend of frontend and backend in the software?
Does it have a user-friendly interface? How much work is required to get the application in the Project started?
Is the demo video and explanation of the project well thought out? Is it clear how this project showcases NVIDIA AI Workbench features?
How big of an impact could the project have on the Dell & NVIDIA developer community? How big of an impact could it have beyond the target community?
How creative and unique is the project? Does the concept exist already? If so, how much does the project improve on it?
https://youtu.be/TRjq7t2Ms5I?si=NwU0mi2SNJ-Faan7
https://cobusgreyling.medium.com/fine-tuning-llms-with-retrieval-augmented-generation-rag-c66e56aec858
https://medium.com/@techsachin/metrag-a-multi-layered-thoughts-enhanced-retrieval-augmented-generation-framework-894e59ee56da
https://github.com/NVIDIA/workbench-example-agentic-rag
https://github.com/NVIDIA/workbench-example-hybrid-rag
https://huggingface.co/datasets/ChristophSchuhmann/basic-math-problems-with-step-by-step-solutions
https://paperswithcode.com/dataset/math
https://huggingface.co/datasets/deepmind/math_dataset