
Frank Tip - University of Waterloo
128 publications, 20,365 reads, 6,972 citations
Publications (128)
In mutation testing, the quality of a test suite is evaluated by introducing faults into a program and determining whether the program’s tests detect them. Most existing approaches for mutation testing involve the application of a fixed set of mutation operators, e.g., replacing a “+” with a “-”, or removing a function’s body. However, certain type...
We study the problem of finding incorrect property accesses in JavaScript where objects do not have a fixed layout, and properties (including methods) can be added, overwritten, and deleted freely throughout the lifetime of an object. Since referencing a non-existent property is not an error in JavaScript, accidental accesses to non-existent proper...
Unit tests play a key role in ensuring the correctness of software. However, manually creating unit tests is a laborious task, motivating the need for automation. Large Language Models (LLMs) have recently been applied to this problem, utilizing additional training or few-shot learning on examples of existing tests. This paper presents a large-scal...
Unit tests play a key role in ensuring the correctness of software. However, manually creating unit tests is a laborious task, motivating the need for automation. Large Language Models (LLMs) have recently been applied to various aspects of software development, including their suggested use for automated generation of unit tests, but while requiri...
JavaScript is an increasingly popular language for server-side development, thanks in part to the Node.js runtime environment and its vast ecosystem of modules. With the Node.js package manager npm, users are able to easily include external modules as dependencies in their projects. However, npm installs modules with all of their functionality, eve...
Event-driven programming is widely practiced in the JavaScript community, both on the client side to handle UI events and AJAX requests, and on the server side to accommodate long-running operations such as file or network I/O. Many popular event-based APIs allow event names to be specified as free-form strings without any validation, potentially l...
Call graphs have many applications in software engineering, including bug-finding, security analysis, and code navigation in IDEs. However, the construction of call graphs requires significant investment in program analysis infrastructure. An increasing number of programming languages compile to the Java Virtual Machine (JVM), and program analysis...
Event-driven programming is widely used for implementing user interfaces, web applications, and non-blocking I/O. An event-driven program is organized as a collection of event handlers whose execution is triggered by events. Traditional static analysis techniques are unable to reason precisely about event-driven code because they conservatively ass...
Previous approaches to dynamic taint analysis for JavaScript are implemented directly in a browser or JavaScript engine, limiting their applicability to a single platform and requiring ongoing maintenance as platforms evolve, or they require nontrivial program transformations. We present an approach that relies on instrumentation to encode taint pr...
Asynchronous client-server communication is a common source of errors in JavaScript web applications. Such errors are difficult to detect using ordinary testing because of the nondeterministic scheduling of AJAX events. Existing automated event race detectors are generally too imprecise or too inefficient to be practically useful. To address this p...
Test generation has proven to provide an effective way of identifying programming errors. Unfortunately, current test generation techniques are challenged by higher-order functions in dynamic languages, such as JavaScript functions that receive callbacks. In particular, existing test generators suffer from the unavailability of statically known typ...
Recently, promises were added to ECMAScript 6, the JavaScript standard, in order to provide better support for the asynchrony that arises in user interfaces, network communication, and non-blocking I/O. Using promises, programmers can avoid common pitfalls of event-driven programming such as event races and the deeply nested counterintuitive contro...
Event races are a common source of subtle errors in JavaScript web applications. Several automated tools for detecting event races have been developed, but experiments show that their accuracy is generally quite low. We present a new approach that focuses on three categories of event race errors that often appear during the initialization phase of...
In JavaScript programs, asynchrony arises in situations such as web-based user-interfaces, communicating with servers through HTTP requests, and non-blocking I/O. Event-based programming is the most popular approach for managing asynchrony, but suffers from problems such as lost events and event races, and results in code that is hard to understand...
We present a type system and inference algorithm for a rich subset of JavaScript equipped with objects, structural subtyping, prototype inheritance, and first-class methods. The type system supports abstract and recursive objects, and is expressive enough to accommodate several standard benchmarks with only minor workarounds. The invariants enforce...
Recent years have seen growing interest in the retrofitting of type systems onto dynamically-typed programming languages, in order to improve type safety, programmer productivity, or performance. In such cases, type system developers must strike a delicate balance between disallowing certain coding patterns to keep the type system simple, or includ...
Many bugs in JavaScript applications manifest themselves as objects that have incorrect property values when a failure occurs. For this type of error, stack traces and log files are often insufficient for diagnosing problems. In such cases, it is helpful for developers to know the control flow path from the creation of an object to a crashing state...
Call graphs have many applications in software engineering. For example, they serve as the basis for code navigation features in integrated development environments and are at the foundation of static analyses performed in verification tools. While many call graph construction algorithms have been presented in the literature, we are not aware of an...
Many JavaScript programs are written in an event-driven style. In particular, in server-side Node.js applications, operations involving sockets, streams, and files are typically performed in an asynchronous manner, where the execution of listeners is triggered by events. Several types of programming errors are specific to such event-based programs...
When two methods are invoked on the same object, the dispatch behaviours of these method calls will be correlated. If two correlated method calls are polymorphic (i.e., they dispatch to different method definitions depending on the type of the receiver object), a program’s interprocedural control-flow graph will contain infeasible paths. Existing a...
Disclosed is a novel computer implemented system, on demand service, computer program product and a method that leverages combined concrete and symbolic execution and several fault-localization techniques to automatically detect failures and localize faults in PHP Hypertext Preprocessor (“PHP”) Web applications.
The present invention provides a system, computer program product and a computer implemented method for prioritizing code fragments based on the use of a software oracle and on a correlation between the executed code fragments and the output they produce. Also described is a computer-implemented method that generates additional user inputs based on exec...
Disclosed is a novel computer implemented system, on demand service, computer program product and a method that provides a set of lock usages that improves concurrency, and thereby the execution performance of the software application, by reducing lock contention through refactoring. More specifically, disclosed is a method to refactor a software applic...
Experts aim to create a contract between PLDI organizers and the broader PLDI community that defines essential organizational and reviewing policies. They wish to establish clear expectations for authors while allowing plenty of leeway for organizers to innovate. Each topic has two subsections: Prescriptions and Suggestions. Prescriptions a...
As Scala gains popularity, there is growing interest in programming tools for it. Such tools often require call graphs. However, call graph construction algorithms in the literature do not handle Scala features, such as traits and abstract type members. Applying existing call graph construction algorithms to the JVM bytecodes generated by the Scala...
A novel system, computer program product, and method are disclosed for transforming a program to facilitate points-to analysis. The method begins with accessing at least a portion of program code, such as JavaScript. In one example, a method with at least one dynamic property correlation is identified for extraction. When a method m is identified f...
Automated refactorings as implemented in modern IDEs for Java usually make no special provisions for concurrent code. Thus, refactored programs may exhibit unexpected new concurrent behaviors. We analyze the types of such behavioral changes caused by current refactoring engines and develop techniques to make them behavior-preserving, ranging from s...
The present invention provides a programming model based on a relational view of the heap which defines identity declaratively, obviating the need for equals() and hashCode() methods. Each element in the heap (called a tuple) belongs to a relation type and relates an immutable identity to mutable state. The model entails a stricter contract: iden...
The present invention provides a system, computer program product, and a computer implemented method for analyzing a set of two or more communicating applications. The method includes executing a first application, such as a client application, and executing a second application, such as a server application. The applications are communicating with...
We present an analysis for identifying determinate variables and expressions that always have the same value at a given program point. This information can be exploited by client analyses and tools to, e.g., identify dead code or specialize uses of dynamic language constructs such as eval, replacing them with equivalent static constructs. Our analy...
The present invention provides a system, computer program product, and a computer implemented method for analyzing a set of two or more communicating applications. The method begins with receiving a first and a second application that communicate with each other during execution. Next, an initial input for executing the first application and the second...
Previously, we developed a data-centric approach to concurrency control in which programmers specify synchronization constraints declaratively, by grouping shared locations into atomic sets. We implemented our ideas in a Java extension called AJ, using Java locks to implement synchronization. We proved that atomicity violations are prevented by con...
The rapid rise of JavaScript as one of the most popular programming languages of the present day has led to a demand for sophisticated IDE support similar to what is available for Java or C#. However, advanced tooling is hampered by the dynamic nature of the language, which makes any form of static analysis very difficult. We single out efficient c...
A system and method for ensuring consistency of data and preventing data races, including steps of: receiving and examining a computer program written in an object-oriented language; receiving sequences of accesses that form logical operations on a set of memory locations used by the program; receiving definitions of atomic sets of data from the me...
Disclosed is a novel computer implemented system, on demand service, computer program product and a method for fault-localization techniques that apply statistical analyses to execution data gathered from multiple tests. The present invention determines the fault-localization effectiveness of test suites generated according to several test-generati...
JavaScript poses significant challenges for points-to analysis, particularly due to its flexible object model in which object properties can be created and deleted at run-time and accessed via first-class names. These features cause an increase in the worst-case running time of field-sensitive Andersen-style analysis, which becomes O(N^4), where N...
PHP web applications routinely generate invalid HTML. Modern browsers silently correct HTML errors, but sometimes malformed pages render inconsistently, cause browser crashes, or expose security vulnerabilities. Fixing errors in generated pages is usually straightforward, but repairing the generating PHP program can be much harder. We observe that...
In recent years, there has been significant interest in fault-localization techniques that are based on statistical analysis of program constructs executed by passing and failing executions. This paper shows how the Tarantula, Ochiai, and Jaccard fault-localization algorithms can be enhanced to localize faults effectively in Web applications writte...
JavaScript is one of the most widely used programming languages of the present day. While its flexibility is treasured by proponents, its lack of language support for encapsulation is an obstacle to writing maintainable programs. We propose refactorings for improving modularity, and discuss challenges arising in their implementation.
Refactoring is a popular technique for improving the structure of existing programs while maintaining their behavior. For statically typed programming languages such as Java, a wide variety of refactorings have been described, and tool support for performing refactorings and ensuring their correctness is widely available in modern IDEs. For the Jav...
Various coverage criteria are commonly used to assess the quality of test suites, but achieving full coverage according to these criteria is often impossible or impractical. Our research starts from the popular assumption that a disproportionate number of faults is likely to reside in recently changed code. Based on this assumption, we propose sev...
Recent versions of the Java standard library offer flexible locking constructs that go beyond the language's built-in monitor locks in terms of features, and that can be fine-tuned to suit specific application scenarios. Under certain conditions, the use of these constructs can improve performance significantly, by reducing lock contention. However...
Current practice in testing JavaScript web applications requires manual construction of test cases, which is difficult and tedious. We present a framework for feedback-directed automated test generation for JavaScript in which execution is monitored to collect information that directs the test generator towards inputs that yield increased coverage....
Type constraints express subtype relationships between the types of program expressions, for example, those relationships that are required for type correctness. Type constraints were originally proposed as a convenient framework for solving type checking and type inference problems. This paper shows how type constraints can be used as the basis fo...
Concurrency-related errors, such as data races, are frustratingly difficult to track down and eliminate in large object-oriented programs. Traditional approaches to preventing data races rely on protecting instruction sequences with synchronization operations. Such control-centric approaches are inherently brittle, as the burden is on the programme...
Fault-localization techniques that apply statistical analyses to execution data gathered from multiple tests are quite effective when a large test suite is available. However, if no test suite is available, what is the best approach to generate one? This paper investigates the fault-localization effectiveness of test suites generated according to s...
Web script crashes and malformed dynamically-generated web pages are common errors, and they seriously impact the usability of web applications. Current tools for web-page validation cannot handle the dynamically generated pages that are ubiquitous on today's Internet. We present a dynamic test generation technique for the domain of dynamic web app...
Data-centric synchronization groups fields of objects into atomic sets to indicate that they must be updated atomically. Each atomic set has a number of associated units of work, code fragments that preserve the consistency of that atomic set. This paper presents a type system for data-centric synchronization that enables separate compilation and s...
We leverage combined concrete and symbolic execution and several fault-localization techniques to create a uniquely powerful tool for localizing faults in PHP applications. The tool automatically generates tests that expose failures, and then automatically localizes the faults responsible for those failures, thus overcoming the limitation of previo...
A program is reentrant if distinct executions of that program on distinct inputs cannot affect each other. Reentrant programs have the desirable property that they can be deployed on parallel machines without additional concurrency control. Many existing Java programs are not reentrant because they rely on mutable global state. We present a mostly-...
Software development teams exchange source code in shared repositories. These repositories are kept consistent by having developers follow a commit policy, such as "Program edits can be committed only if all available tests succeed." Such policies may result in long intervals between commits, increasing the likelihood of duplicative developme...
Developers use unit testing to improve the quality of software systems. Current development tools for unit testing help with automating test execution, with reporting results, and with generating test stubs. However, they offer no aid for designing tests aimed specifically at exercising the effects of changes to a program. This paper d...
There is a disconnect between modelling and implementation: relationships are prevalent in system models but implementation languages do not provide first-class support for them. For example, in Java (and other Object-Oriented Languages), relationships must be implemented by hand using references embedded in participants. This approach is cumbersom...
Previously we presented atomic sets, memory locations that share some consistency property, and units of work, code fragments that preserve consistency of atomic sets on which they are declared. We also proposed atomic-set serializability as a correctness criterion for concurrent programs, stating that units of work must be serializable for each...
Web script crashes and malformed dynamically-generated Web pages are common errors, and they seriously impact usability of Web applications. Current tools for Web-page validation cannot handle the dynamically-generated pages that are ubiquitous on today's Internet. In this work, we apply a dynamic test generation technique, based on combined conc...
We present an approach for checking code against rich specifications, based on existing work that consists of encoding the program in a relational logic and using a constraint solver to find specification violations. We improve the efficiency of this approach with a new encoding of the program that effectively slices it at the logical...
Type constraints express subtype-relationships between the types of program expressions that are required for type-correctness, and were originally proposed as a convenient framework for solving type checking and type inference problems. In this paper, we show how type constraints can be used as the basis for practical refactoring tools. In our app...
Object-oriented languages define the identity of an object to be an address-based object identifier. The programmer may customize the notion of object identity by overriding the equals() and hashCode() methods following a specified contract. This customization often introduces latent errors, since the contract is unenforced and at times impossi...
WRT'07 was the first instance of the Workshop on Refactoring Tools. It was held in Berlin, Germany, on July 31st, in conjunction with ECOOP'07. The workshop brought together over 50 participants from both academia and industry. Participants include the lead developers of two widely used refactoring engines (Eclipse and NetBeans), researchers that w...
Crisp is an Eclipse plug-in tool for constructing intermediate versions of a Java program that is being edited. After a long editing session, a programmer will run regression tests to make sure she has not invalidated previously tested functionality. If a test fails unexpectedly, Crisp allows the programmer to select parts of the edit that affected...
Type safety and expressiveness of many existing Java libraries and their client applications would improve if the libraries were upgraded to define generic classes. Efficient and accurate tools exist to assist client applications to use generic libraries, but so far the libraries themselves must be parameterized manually, which is a tedious, tim...
We present an operational semantics with a type safety proof for multiple inheritance in C++, formalized in and machine-checked by the theorem prover Isabelle/HOL. The type safety of C++'s inheritance mechanism had long been an open question. The proof now available increases confidence in the language, but also yields new insight...
Testing and code editing are interleaved activities during program development. When tests fail unexpectedly, the changes that caused the failure(s) are not always easy to find. We explore how change classification can focus programmer attention on failure-inducing changes by automatically labeling changes Red, Yellow, or Green, indicating the like...
We present an operational semantics and type safety proof for multiple inheritance in C++. The semantics models the behaviour of method calls, field accesses, and two forms of casts in C++ class hierarchies exactly, and the type safety proof was formalized and machine-checked in Isabelle/HOL. Our semantics enables one, for the first time, to unders...
The ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM 2006) was held on January 9th and 10th in Charleston, South Carolina, and this article contains abstracts for the 19 papers presented at the workshop. The PEPM workshops focus on techniques, supporting theory, tools, and applications of the analysis and manipulation of pro...
Program slicing is a useful technique for debugging, testing, and analyzing programs. A program slice consists of the parts of a program which (potentially) affect the values computed at some point of interest. With rare exceptions, program slices have hitherto been computed and defined in ad-hoc and language-specific ways. The principal contributi...
Concurrency-related bugs may happen when multiple threads access shared data and interleave in ways that do not correspond to any sequential execution. Their absence is not guaranteed by the traditional notion of "data race" freedom. We present a new definition of data races in terms of 11 problematic interleaving scenarios, and prove that it i...
We present refactorings that automate the process of migrating pre-generics Java programs to use generics. The task is divided in two parts: introduction of formal type parameters (parameterization) and inference of actual type parameters (instantiation). We developed efficient techniques and tools to assist developers in both parts. We will presen...
As object-oriented class libraries evolve, classes are occasionally deprecated in favor of others with roughly the same functionality. In Java's standard libraries, for example, class Hashtable has been superseded by HashMap, and Iterator is now preferred over Enumeration. Migrating client applications to use the new idioms is often desirable, but...
Java 1.5 generics enable the creation of reusable container classes with compiler-enforced type-safe usage. This eliminates the need for potentially unsafe down-casts when retrieving elements from containers. We present a refactoring that replaces raw references to generic library classes with parameterized references. The refactoring infers ac...
Chianti is a change impact analysis tool for Java that is implemented in the context of the Eclipse environment. Chianti analyzes two versions of a Java program, decomposes their difference into a set of atomic changes, and calculates a partial order of interdependences among these changes. Change impact is then reported in terms of affected (regress...
We present, for the first time, an operational semantics and a type system for a C++-like object-oriented language with both shared and repeated multiple inheritance, together with a machine-checked proof of type safety. The formalization uncovered several subtle ambiguities in C++, which C++ compilers resolve by ad-hoc means or which even result...
Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., pa...
This paper reports on the design and implementation of Chianti, a change impact analysis tool for Java that is implemented in the context of the Eclipse environment. Chianti analyzes two versions of an application and decomposes their difference into a set of atomic changes. Change impact is then reported in terms of affected (regression or unit) t...
We will demonstrate several advanced refactorings for Java that have been implemented in the context of the Eclipse development environment for Java (see www.eclipse.org). These refactorings are semantics-preserving program transformations that are typical of the sorts of transformations object-oriented programmers perform manually in order to impr...
The use of class libraries increases programmer productivity by allowing programmers to focus on the functionality unique to their application. However, library classes are generally designed with some typical usage pattern in mind, and performance may be suboptimal if the actual usage differs. We present an approach for rewriting applications to u...
Version 1.5 of the Java programming language will include generics, a language construct for associating type parameters with classes and methods. Generics are particularly useful for creating statically type-safe, reusable container classes such that a store of an inappropriate type causes a compile-time error, and that no down-casting is needed...
Refactoring is the process of applying behavior-preserving transformations (called "refactorings") in order to improve a program's design. Associated with a refactoring is a set of preconditions that must be satisfied to guarantee that program behavior is preserved, and a set of source code modifications. An important category of refactorings is co...
The potential of application extraction techniques to reduce the size of library-based Java applications by constructing an application extractor, Jax, is discussed. Extraction techniques incorporated in Jax include the removal of unreachable methods and redundant methods, and compaction of the class hierarchy. Extraction techniques are effective f...
Reducing application size is important for software that is distributed via the internet, in order to keep download times manageable, and in the domain of embedded systems, where applications are often stored in (Read-Only or Flash) memory. This paper explores extraction techniques such as the removal of unreachable methods and redundant fields, in...