On Feb. 5, 2025, Nazanin Mahjourian, a Mechanical Engineering PhD student at Tech, presented her method for refining dataset labels using vision-language models at this year’s first Artificial Intelligence (AI) Colloquium. Sponsored by MTU’s Center for AI, AI Colloquiums will be held every two weeks this semester. “The point of these [AI Colloquiums] is to lower the entry standards for AI, particularly for non-STEM majors,” said the Director of the Center for AI, Dr. Vinh Nguyen. MTU’s Center for AI, started in 2022, acts as a hub where students, colleges, and industry partners can work on and share new advancements in AI. Nguyen said, “The point of the Center for AI is to help people get into AI in this fast-changing environment. The goal is democratizing AI and having an AI maker space to help provide students/faculty with the hardware and software.”
Her presentation covered the use of CLIP (Contrastive Language-Image Pre-training), an OpenAI model that connects text and images, to generate labels for a dataset. The labels were then refined in two steps. First, a model condensed the multiple labels attached to each data entry into simpler and more accurate labels. Second, these labels were classified into groups to make the dataset more manageable. As an example, the presentation took several different images of bicycles and labeled each as a “bike” along with its type (such as electric bicycles or groups of bicycles). These labels were filtered to remove inaccurate ones (such as misspellings), and the remaining labels were all classified under the larger group of “bicycle”. This research will help those who need to sort large datasets into simpler classifications or groups for further development or access. Currently, the main limitation of this method is that the grouping algorithm still has a manual review component, which she hopes to automate going forward.
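For readers curious what the refine-then-group idea looks like in practice, here is a minimal Python sketch. It is not Mahjourian’s method: the label vocabulary, the function names, and the use of simple string similarity (in place of a vision-language model) are all assumptions made for illustration. It cleans up raw labels, maps near-misses such as misspellings to a known label, assigns each label to a broader group, and sets aside anything it cannot match for manual review.

```python
from collections import defaultdict
from difflib import get_close_matches

# Hypothetical vocabulary: fine-grained label -> broader group.
# These entries are illustrative, not taken from the presentation.
LABEL_TO_GROUP = {
    "electric bicycle": "bicycle",
    "group of bicycles": "bicycle",
    "mountain bike": "bicycle",
    "sedan": "car",
}


def refine_label(raw, cutoff=0.8):
    """Return the closest known label, or None if nothing is close enough."""
    # Normalize case and collapse repeated whitespace.
    cleaned = " ".join(raw.lower().split())
    # String similarity stands in for a model-based refinement step.
    matches = get_close_matches(cleaned, list(LABEL_TO_GROUP), n=1, cutoff=cutoff)
    return matches[0] if matches else None


def group_labels(raw_labels):
    """Group refined labels; collect unmatched ones for manual review."""
    grouped = defaultdict(list)
    needs_review = []
    for raw in raw_labels:
        label = refine_label(raw)
        if label is None:
            needs_review.append(raw)
        else:
            grouped[LABEL_TO_GROUP[label]].append(label)
    return dict(grouped), needs_review


if __name__ == "__main__":
    raw = ["Electric  Bicycle", "electirc bicycle", "group of bicycles", "zebra"]
    groups, review = group_labels(raw)
    print(groups)
    print(review)
```

In this sketch, the list of unmatched labels plays the role of the manual review step mentioned above: anything the automatic matching cannot place is left for a person to check.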
To learn more about Mahjourian’s research, contact her at mahjouri@mtu.edu. For questions about the Center for AI, contact Nguyen at vinhn@mtu.edu. Lastly, if you would like to know more about the wide range of topics in AI research here at Tech, the next AI Colloquium, on deep reinforcement learning and large language models, is scheduled for Wednesday, Feb. 19, from 12 p.m. to 1 p.m. in EERC 216.