Enabling Effective Metamorphic-Relation Generation by Novice Testers: A Pilot Study
IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC), July 2024
Yifan Zhang, Dave Towey, Matthew Pike. 2024. Enabling Effective Metamorphic-Relation Generation by Novice Testers: A Pilot Study. In IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC). DOI: https://doi.org/10.1109/COMPSAC61105.2024.00384
@inproceedings{compsac-2024-5,
title={Enabling Effective Metamorphic-Relation Generation by Novice Testers: A Pilot Study},
author={Yifan Zhang and Dave Towey and Matthew Pike},
booktitle={IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC)},
year={2024},
doi={10.1109/COMPSAC61105.2024.00384}
}
Metamorphic testing, Novice testers, Metamorphic relations, Autonomous driving system, Driving scenarios, Large language models, Artificial intelligence
Abstract
This paper presents a pilot study that examines the capacity of novice testers to generate Metamorphic Relations (MRs) for autonomous driving systems (ADSs), specifically focusing on parking functions. By comparing MRs generated by human participants with those generated by artificial intelligence (AI), we seek to understand the variances in quality, particularly in terms of correctness, applicability, novelty, and utility. Our findings indicate that despite receiving only minimal training, human participants were capable of producing MRs with a wide range of effectiveness. Notably, humans exhibited a potential for creative thinking, contrasting with AI's ability to generate MRs that adhere closely to technical and applicability standards. The study underscores the need for improved educational strategies aimed at enhancing the quality and confidence of MRs produced by humans. Future research directions will explore the optimization of training approaches, particularly within a constrained timeframe to create a positive learning experience and maintain participant engagement, to fully harness the creative capabilities of human learners in the context of ADS testing.
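
For readers unfamiliar with the term, an MR states how a system's outputs should relate when its input is transformed in a known way, so a test can pass or fail without knowing the single correct output. The sketch below is illustrative only and is not taken from the study: the `Scene` type, the toy controller, and the mirror-symmetry relation are all hypothetical, chosen to show what an MR for a parking function might look like.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Scene:
    """Hypothetical, heavily simplified parking scene (lateral positions in metres)."""
    car_x: float
    slot_x: float


def toy_parking_controller(scene: Scene) -> float:
    """Hypothetical system under test: steering command clamped to [-1, 1]."""
    offset = scene.slot_x - scene.car_x
    return max(-1.0, min(1.0, 0.5 * offset))


def biased_controller(scene: Scene) -> float:
    """Faulty variant with a constant steering bias, used to show an MR violation."""
    return toy_parking_controller(scene) + 0.1


def mirror(scene: Scene) -> Scene:
    """Input transformation: reflect the scene left-to-right."""
    return Scene(car_x=-scene.car_x, slot_x=-scene.slot_x)


def mr_mirror_holds(controller: Callable[[Scene], float],
                    scene: Scene, tol: float = 1e-9) -> bool:
    """Assumed MR: mirroring the scene should negate the steering command."""
    source_output = controller(scene)
    follow_up_output = controller(mirror(scene))
    return abs(source_output + follow_up_output) <= tol


if __name__ == "__main__":
    scene = Scene(car_x=0.0, slot_x=1.5)
    print(mr_mirror_holds(toy_parking_controller, scene))
    print(mr_mirror_holds(biased_controller, scene))
```

The check never needs the "right" steering value for either scene; it only compares the source and follow-up outputs, which is what makes MRs attractive for systems such as ADSs where expected outputs are hard to specify.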