This talk will discuss code anomalies — code fragments that are written in some way that is not typical for the programming language community. Such code fragments are useful for language creators as performance tests, or they could provide insights on how to improve the language. With Kotlin as the target language, we discuss how the task of detecting code anomalies for a very large codebase could be solved using well-known anomaly detection techniques. We outline and discuss approaches to obtain code vector representation and to perform anomaly detection on such vectorized data. The talk will highlight examples of such anomalies found in open source GitHub repositories.
Speakers: Timofey Bryksin