Quarter
Faculty Reference
Tevfik Bultan
Course Type
Course Area
Foundations
Enrollment Code
62075
Location
Phelps 2510
Units
4
Day and Time
TR 11:00-12:50
Course Description

String manipulation is a crucial part of modern software systems; for example, it is used extensively in input validation and sanitization, and in dynamic code and query generation. The goal of string-analysis techniques is to determine the set of values that string expressions can take during program execution. String analysis can be used to solve many string-manipulation problems in modern software systems, such as: (1) identifying security vulnerabilities by checking whether a security-sensitive function can receive an input string that contains an exploit; (2) identifying behaviors of JavaScript code that uses the eval function by computing the string values that can reach the eval function; (3) identifying HTML generation errors by computing the HTML code generated by web applications; (4) identifying the set of queries that are sent to the back-end database by analyzing the code that generates the SQL queries; (5) patching input validation and sanitization functions by automatically synthesizing repairs.

Like many other program-analysis problems, the string-analysis problem cannot be solved precisely in general (i.e., it is not possible to precisely determine the set of string values that can reach a program point). However, one can compute over- or under-approximations of the possible string values. If the approximations are precise enough, they can be used to demonstrate the existence or absence of bugs in string-manipulating code. String analysis has been an active research area in the last decade, resulting in a wide variety of string-analysis techniques. Some of the topics we plan to discuss in this course include grammar-based string analysis, automata-based symbolic string analysis, string constraint solving, string abstractions, relational string analysis, vulnerability detection using string analysis, differential string analysis, and automated repair using string analysis.
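
To make the idea of an over-approximation concrete, here is a minimal sketch in Python. It is not one of the techniques covered in the course, only a toy character-inclusion abstraction: each string value is represented by the set of characters it may contain. The names MayChars, ANY_INPUT, and build_page are hypothetical, and the sketch assumes user input is limited to printable ASCII.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MayChars:
    """Toy abstraction: the set of characters a string value MAY contain."""
    chars: frozenset

    @staticmethod
    def of_literal(s: str) -> "MayChars":
        # A literal contains exactly its own characters.
        return MayChars(frozenset(s))

    def concat(self, other: "MayChars") -> "MayChars":
        # A concatenation may contain any character of either operand.
        return MayChars(self.chars | other.chars)

    def join(self, other: "MayChars") -> "MayChars":
        # Merge the values arriving from two branches of the program.
        return MayChars(self.chars | other.chars)

    def remove_char(self, ch: str) -> "MayChars":
        # Models s.replace(ch, "") for a single character ch,
        # which deletes every occurrence of ch.
        return MayChars(self.chars - {ch})

    def may_contain(self, ch: str) -> bool:
        return ch in self.chars


# Assumption: untrusted input is any string over printable ASCII.
ANY_INPUT = MayChars(frozenset(chr(c) for c in range(32, 127)))


def build_page(sanitize: bool) -> MayChars:
    """Abstractly execute: page = "Hello, " + (sanitized) user input."""
    user = ANY_INPUT
    if sanitize:
        user = user.remove_char("<")
    return MayChars.of_literal("Hello, ").concat(user)


if __name__ == "__main__":
    # False: no string reaching the sink can contain "<".
    print(build_page(sanitize=True).may_contain("<"))
    # True: a string containing "<" may reach the sink.
    print(build_page(sanitize=False).may_contain("<"))
```

Because the abstraction over-approximates, a False answer demonstrates the absence of the bug: no concrete string reaching the sink can contain "<", so none can contain "<script". A True answer only says a bug is possible; confirming it would require a more precise analysis or an under-approximation that exhibits a concrete exploit string.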