diff --git a/GSOC.html b/GSOC.html --- a/GSOC.html +++ b/GSOC.html @@ -58,7 +58,7 @@

-

Feel free to copy and use our Template GSOC Proposal

+

Feel free to copy and use our Template GSOC Proposal

Engage with your mentor about your project
diff --git a/OpenProjects.html b/OpenProjects.html --- a/OpenProjects.html +++ b/OpenProjects.html @@ -34,158 +34,8 @@ - - - -
  • - Google Summer of Code 2021 -
  • -
  • - Google Summer of Code 2020 - - -
  • -
  • Google Summer of Code 2019 - -
  • -
  • Google Summer of Code 2018
  • -
  • Google Summer of Code 2017
  • What is this?
  • LLVM Subprojects: Clang and more
  • @@ -243,21 +93,7 @@

    We encourage you to look through this list and see which projects excite you and match well with your skill set. We also invite proposals not on this - list. More information and discussion about GSoC can be found in - - discourse - . If you have questions about a particular project please find the - relevant entry in discourse, check previous discussion and ask. If there is - no such entry or you would like to propose an idea please create a new - entry. Feedback from the community is a requirement for your proposal to be - considered and hopefully accepted. -

    - -

    The LLVM project has participated in Google Summer of Code for several years - and has had some very successful projects. We hope that this year is no - different and look forward to hearing your proposals. For information on how - to submit a proposal, please visit the Google Summer of Code main - website. + list. For full details on how to apply and required steps, please see our Google Summer of Code page.

    @@ -823,1629 +659,6 @@

    - -
    - Google Summer of Code 2021 -
    - - -
    -

    - Welcome prospective Google Summer of Code 2021 Students! This document is your - starting point to finding interesting and important projects for LLVM, Clang, - and other related sub-projects. This list of projects is not only developed for - Google Summer of Code, but open projects that really need developers to work on - and are very beneficial for the LLVM community.

    - -

    We encourage you to look through this list and see which projects excite you - and match well with your skill set. We also invite proposals not on this - list. You must propose your idea to the LLVM community through our - developers' mailing list (llvm-dev@lists.llvm.org or specific subproject mailing - list). Feedback from the community is a requirement for your proposal to be - considered and hopefully accepted. -

    - -

    The LLVM project has participated in Google Summer of Code for several years - and has had some very successful projects. We hope that this year is no - different and look forward to hearing your proposals. For information on how to - submit a proposal, please visit the Google Summer of Code - main website.

    -
    - - -
    - LLVM -
    - - - -
    - Distributed lit testing -
    - - -
    -

    Description of the project: - The LLVM lit test suites consist of thousands of small independent tests. - Due to the number of tests, it can take a long time to run the full suite, - even on a high-spec computer. Builds are already distributable across - multiple computers available on the same network, using software such as - distcc or icecream, so running tests on a single machine becomes a potential - bottleneck. One way to speed up running of the tests could be to distribute - test execution across many computers too. Lit provides a test sharding - mechanism, which allows multiple computers to run parts of the same - testsuite in tandem, but this currently assumes access to a single common - filesystem, which may not be possible in all cases and a knowledge of which - machines the suite can currently be run on. - - This project’s goal is to update the existing lit harness (or write a - wrapper around it) to allow distribution of the tests in this way, with the - idea that developers can write their own interface between the harness and - the distribution system of their choice. This harness may need to be able to - identify test dependencies such as input files and executables, send the - tests to the distribution system (possibly in batches), and receive, collate - and report the results to the user, in a similar manner to how lit already - does. -

    - -

    Expected results: An easy to use harness as described above. Some - evidence that given a distributed system, a user can expect to see test - suite execution to speed up if they are using that harness.

    - -

    Confirmed mentor: James Henderson

    -

    Desirable skills: Good knowledge of Python. Familiarity with LLVM - lit testing. Some knowledge of distribution systems would also be - beneficial.

    -
    - - -
    - Learning Loop Transformation Heuristics -
    - - -
    -

    Description of the project: - This is a short description, please reach out to Johannes (jdoerfert on - IRC) and Mircea Trofin if it sounds interesting. - - We successfully introduced an ML framework for inliner decisions, now we want - to expand the scope. In this project we will look at loop transformation - heuristics, such as the unroll factor. As a motivational example we can look - at a small trip count dgemm which - we optimize pretty poorly. With the nounroll pragmas we do a better job but - still not close to gcc. - - The project is open-ended and we could look at various passes/heuristics - concurrently. -

    - -

    Preparation resources: The ML inliner framework in the LLVM code - base as well as the paper. LLVM - transform passes (that are based on heuristics), e.g., loop unroll.

    - -

    Expected results: Measurably better performance with a learned - predictor, potentially a set of "classical" heuristics derived from the ML - model.

    - -

    Confirmed Mentor: Johannes Doerfert, Mircea Trofin

    -

    Desirable skills: Intermediate knowledge of ML, C++, self motivation.

    -
    - - -
    - Fuzzing LLVM-IR Passes -
    - - -
    -

    Description of the project: - This is a short description, please reach out to Johannes (jdoerfert on - IRC) if it sounds interesting. - - Fuzzing often reveals a myriad of bugs. CSmith (and others) showed how to do - this with C-like languages and we have used LLVM-IR fuzzing in - the past successfully. In this project we will apply fuzzing to new passes - that are in development, e.g., the Attributor pass. We want to find and fix - crashes but also other bugs, including compile time performance problems. -

    - -

    Preparation resources: The LLVM fuzzer - infrastructure. LLVM passes that we might want to fuzz, e.g. the - Attributor pass. Prior IR-Fuzzing work - (https://www.youtube.com/watch?v=UBbQ_s6hNgg)

    - - -

    Expected results: Crashes, maybe also a way to catch non-crash - bugs, including performance problems.

    - -

    Confirmed Mentor: Johannes Doerfert

    -

    Desirable skills: Intermediate knowledge of C++, self motivation.

    -
    - - -
    - llvm.assume the missing pieces -
    - - -
    -

    Description of the project: - This is a short description, please reach out to Johannes (jdoerfert on - IRC) if it sounds interesting. - - llvm.assume is a powerful mechanism to retain knowledge. Since its - inception it was improved already multiple times but there are major - extensions still outstanding which we want to tackle in this project. - An incomplete list of topics includes: -

    - -

    - -

    Preparation resources: The llvm.assumption usage, the assumption - cache, the "enable-knowledge-retention" option, the RFC - and this review. -

    - - -

    Expected results: New llvm.assume use cases, improved performance through knowledge retention, optimization based on assertions.

    - -

    Confirmed Mentor: Johannes Doerfert

    -

    Desirable skills: Intermediate knowledge of C++, self motivation.

    -
    - - -
    - Fix fundamental issues in LLVM's IR -
    - - -
    -

    Description of the project: - LLVM's IR has fundamental, long-standing issues. Many are related to - undefined behaviors. Others are simply a fallout from underspecification - and different interpretations by different people. - Alive2 is a tool that - detects bugs in LLVM's optimizations automatically. Using Alive2, we track - bugs exposed by the unit tests on a - dashboard. -

    - -

    Expected results: - 1) Report and fix bugs detected by Alive2. - 2) Pick one fundamental IR issue and - make progress towards fixing it, including proposing fixes for the - semantics, testing - fixes to the semantics by running Alive2 over the LLVM unit tests and - medium-sized programs, test performance of semantic fixes and fix - performance regressions. -

    -

    Confirmed Mentor: Nuno Lopes, Juneyoung Lee

    -

    Desirable skills: Intermediate C++; willingness to learn about LLVM - IR semantics; experience reading papers (preferred). -

    -
    - - -
    - Utilize LoopNest Pass -
    - - -
    -

    Description of the project: - The idea of LoopNest pass is recently added, and there are no existing - passes utilizing it. Before having LoopNest pass, if you want to write a - pass that works on a loop nest, you have to pick from either a function - pass or a loop pass. If you chose to write it as a function pass, then you - lose the ability to add loops dynamically back to the pipeline. If you - decide to write it as a loop pass, then you are wasting compile time to - traverse to your pass and return right away when the given loop is not the - outermost loop. In this project, we want to utilize the recently introduced - LoopNest pass for passes intended for loop nest and have the same ability - as the LoopPass to dynamically add loops to the pipeline. In addition, - improve the current implementation of LoopNestPass when necessary. -

    -

    Expected results (possibilities): - Utilize LoopNest Pass for some existing transformations/analyses. -

    -

    Confirmed Mentors: - Whitney Tsang, Ettore Tiotto -

    -

    Desirable skills: - Intermediate knowledge of C++, self-motivation. -

    -
    - - -
    - JIT-ing OpenMP GPU kernels transparently -
    - - -
    -

    Description of the project: - This is a short description, please reach out to Johannes (jdoerfert on - IRC) if it sounds interesting. - - OpenMP GPU kernels are usually lowered to native binaries, e.g., cubin, and - embedded into the host object. At runtime, OpenMP "plugins" will connect with - the device driver, e.g., CUDA, to load and run such embedded binary images. - In this project we want to develop a new plugin that takes LLVM-IR code, optimizes - the IR with kernel parameters known only at runtime, and then generates the GPU - binary for consumption by other plugins. Similar to the remote - offload plugin we can do this transparently to the user. In addition to the JIT - infrastructure setup in the plugin we will need to embed the IR into the host object. - -

    - -

    Preparation resources:OpenMP target offloading infrastructure, LLVM JIT infrastructure. -

    - - -

    Expected results: A JIT-capable offload plugin which can achieve superior performance when kernel specialization is enabling optimizations.

    - -

    Confirmed Mentor: Johannes Doerfert

    -

    Desirable skills: Intermediate knowledge of C++, JIT compilation, self motivation.

    -
    - - -
    - OpenACC -
    - - - -
    - OpenACC Diagnostics from the OpenMP Runtime -
    - - -
    -

    Description of the project: - Clacc and Flacc are projects to introduce OpenACC support to Clang and - Flang. For that purpose, OpenACC runtime support is being developed on top - of LLVM's OpenMP runtime. However, diagnostics emitted by LLVM's OpenMP - runtime are expressed in terms of OpenMP concepts, and so those diagnostics - are not always meaningful to OpenACC users. This project should address - this issue in two steps: -

      -
    1. - Develop a mechanism that selects OpenACC versions of diagnostics that - are emitted as a result of OpenACC-related calls into the runtime. This - mechanism should be general enough that it could be used for programming - languages besides OpenMP and OpenACC. One possible approach is to - extend internationalization mechanisms already present in some - components of the OpenMP runtime. -
    2. -
    3. - Provide OpenACC translations for existing OpenMP diagnostics. This step - requires an understanding of the relationship between OpenACC and OpenMP - as implemented in Clacc and Flacc. -
    4. -
    - Many components of OpenACC support that will depend upon this project have - not yet been upstreamed and are under development. A high-level - understanding of those efforts is helpful for this project and can be - provided by the mentors. Nevertheless, this project can be completed in - upstream LLVM's OpenMP runtime now independently of those efforts. -

    - -

    - Expected results: A version of upstream LLVM's OpenMP runtime that - can emit OpenACC diagnostics as needed. -

    - -

    Confirmed Mentors: Valentin Clement, Joel E. Denny

    - -

    - Desirable skills: Intermediate C++; Experience with OpenACC or - OpenMP -

    -
    - - -
    - Polly -
    - - - -
    - Use official isl C++ bindings -
    - - -
    -

    Description of the project: - Polly uses algorithms from the - Integer Set Library (isl), which is a - library written in C. It uses reference-counting for memory management. - Getting reference counting correct is much easier in C++ using RAII, - therefore we created a C++ binding for isl: - isl-noexceptions.h. - Since then, isl also gained two official C++ bindings, - cpp.h - and - cpp-checked.h. - - We would like to replace the Polly-maintained C++ bindings with the upstream - bindings. Unfortunately, this is not an in-place replacement. Differences - include how errors are checked, method names, which functions are - considered as operator/constructor overloads and the set of exported functions. - This will require changing Polly's uses of the C++ bindings and submitting - patches to isl to export additional functionality needed by Polly. -

    - -

    Expected results: - Reduce the differences between the Polly-maintained isl-noexceptions.h - bindings and one of the two C++ bindings that isl supports. Due to - isl-noexceptions.h exporting more functions and classes than the upstream - bindings do, a complete replacement will probably be out of reach, but - even reducing the differences will reduce the maintenance cost of Polly's - isl-noexceptions.h. -

    - -

    Confirmed mentor: Michael Kruse

    -

    Desirable skills: - Deep knowledge of C++, in particular RAII and move-semantics. Interest in API design. Ideally, you already wrote some library's header file. - Experience with the isl library would be nice, but can also be learned in the project. -

    -
    - - -
    - Enzyme -
    - - - -
    - Integrate custom derivatives of BLAS, Eigen, and similar routines into Enzyme -
    - - -
    -

    Description of the project: - Enzyme performs automatic differentiation - (in the calculus sense) of LLVM programs. This enables users to use Enzyme to - perform various algorithms such as back-propagation in ML - or scientific simulation on existing code for any language that lowers to LLVM. - - Enzyme does so by applying the chain rule to every instruction in every - function called by the original function to be differentiated. While functional, - this is not necessarily optimal for high-level matrix operations which may - have algebraic properties for faster derivative computation. - - Enzyme also has a mechanism for specifying a custom gradient for a given function. - If a custom derivative is available, Enzyme will use that rather than fallback - to implementing its own. - - Many programs use BLAS libraries to efficiently compute matrix and tensor - operations. This project would enable high-performance automatic differentiation - of BLAS and similar libraries (such as Eigen) by specifying custom derivative - rules for their operations. -

    - -

    Expected results: - Efficient differentiation of BLAS and Eigen codes by writing custom - derivative rules for matrix and tensor operations. -

    - -

    Confirmed mentor: William Moses, Johannes Doerfert

    -

    Desirable skills: - Good knowledge of C++, calculus, and linear algebra. Experience with BLAS, Eigen, - or Enzyme would be nice, but can also be learned in the project. -

    -
    - - -
    - Integrate Enzyme into Swift to provide high-performance differentiation in Swift -
    - - -
    -

    Description of the project: - Enzyme performs automatic differentiation - (in the calculus sense) of LLVM programs. This enables users to use Enzyme to - perform various algorithms such as back-propagation in ML - or scientific simulation on existing code for any language that lowers to LLVM. - - While this functions for any frontend that emits LLVM IR, it may be desirable - to have closer integration between Enzyme and the frontend for the sake of - passing additional information and creating a better user experience. - - Swift provides automatic differentiation through the use of specifying custom - derivative rules in the front-end. Enzyme could be integrated directly with - Swift, differentiating the eventual LLVM, but it would lose out on all this - additional information about custom derivatives. Moreover, calling into - Enzyme naively would be without Type checking, fine AD-specific debug information, - or various other nice tools that Swift provides users of AD. - - This project would seek to integrate Enzyme and the Swift front end to provide - both a nice user-experience for Swift programmers who want to use Enzyme - to enable high-performance automatic differentiation, and also to allow Enzyme - to take advantage of derivative-specific metadata already available in Swift. -

    - -

    Expected results: - Creation of a custom type-checked linguistic construct in Swift for calling Enzyme. - Mechanisms for passing Swift's differentiation-specific metadata for use by Enzyme. -

    - -

    Confirmed mentor: William Moses, Vassil Vassilev

    -

    Desirable skills: - Good knowledge of C++ and Swift. Experience with Enzyme or automatic differentiation - would be nice, but can also be learned in the project. -

    -
    - - -
    - Differentiation of Fixed-Point Arithmetic -
    - - -
    -

    Description of the project: - Enzyme performs automatic differentiation - (in the calculus sense) of LLVM programs. This enables users to use Enzyme to - perform various algorithms such as back-propagation in ML - or scientific simulation on existing code for any language that lowers to LLVM. - - In a variety of fields, it is desirable to compute on fixed-point values - (e.g. integers) rather than floating point values. This avoids certain truncation - errors that may be critical to a given application. Moreover, particular pieces - of hardware may simply be more efficient on fixed point rather than floating - point values. - - This project would seek to extend Enzyme to support differentiation of not only - floating point base values, but also fixed point base values. -

    - -

    Expected results: - Implementation of adjoints for LLVM fixed point intrinsics, requisite type analysis - rules, and integration into a front-end for an end-to-end test. -

    - -

    Confirmed mentor: William Moses

    -

    Desirable skills: - Good knowledge of C++, calculus, and LLVM internals. Experience with Enzyme or - automatic differentiation would be nice, but can also be learned in the project. -

    -
    - - -
    - Integrate Enzyme into Rust to provide high-performance differentiation in Rust -
    - - -
    -

    Description of the project: - Enzyme performs automatic differentiation - (in the calculus sense) of LLVM programs. This enables users to use Enzyme to - perform various algorithms such as back-propagation in ML - or scientific simulation on existing code for any language that lowers to LLVM. - - While this functions for any frontend that emits LLVM IR, it may be desirable - to have closer integration between Enzyme and the frontend for the sake of - passing additional information and creating a better user experience. - - This project would seek to integrate Enzyme and the Rust front end to provide - a nice user-experience for Rust programmers who want to use Enzyme - to enable high-performance automatic differentiation. This also potentially - involves integration of LLVM plugin support/custom codegen into rustc. -

    - -

    Expected results: - Creation of a custom type-checked linguistic construct in Rust for calling Enzyme. - Mechanisms for parsing Rust's Type information (represented as LLVM debug - info) directly into type analysis. -

    - -

    Confirmed mentor: William Moses

    -

    Desirable skills: - Good knowledge of C++ and Rust. Experience with Enzyme or automatic differentiation - would be nice, but can also be learned in the project. -

    -
    - - -
    - Clang Static Analyzer performance profiling -
    - - -
    -

    Description of the project: -

    -

    - -

    Confirmed mentor: Artem Dergachev

    -
    - - -
    - Clang Static Analyzer constraint solver improvements -
    - - -
    -

    Description of the project: - CSA has a small in-house constraint solver, it is pretty trivial, but super - fast. The goal is to support range-based logic for some of the symbolic - operators, while keeping it linear. Additionally, a unit-test framework - can be designed specifically for testing constraint solvers (right now it’s - tested rather awkwardly). This project has a couple of interesting - properties. It can be segmented into small chunks, and each of these - chunks has a non-trivial solution. It might introduce you to a world of - solvers (it is a good idea to check your ideas with some heavy-weight - solver such as z3). And because the existing solver is simple, there is a - myriad of possible extensions to try. -

    - -

    Confirmed mentor: Valeriy Savchenko

    -
    - - -
    - A structured approach to diagnostics in LLDB -
    - - -
    -

    Description of the project: -

    -

    - -

    Confirmed mentor: - Jonas Devlieghere and Raphael Isemann

    -
    - - -
    - Google Summer of Code 2020 -
    - - -
    -

    - LLVM participation in Google Summer of Code 2020 was very successful and resulted - in many interesting projects contributed to LLVM. For the list of accepted and - completed projects, please take a look into Google Summer of Code - - website. -

    -
    - - -
    - LLVM -
    - - - -
    - Improve inter-procedural analyses and optimizations -
    - - -
    -

    Description of the project: - This is a short description, please reach out to Johannes (jdoerfert on IRC) - if it sounds interesting. - - During GSoC'19 we built the Attributor framework to improve the - inter-procedural capabilities of LLVM. This is useful on its own but - especially in situations where inlining is impossible or undesirable. - - In this GSoC project we will look at capabilities not yet available in the - Attributor and for the potential to connect the Attributor with existing - intra- and inter-procedural optimizations. - - In this project there is a lot of freedom to determine the actual tasks but - we will provide a pool of smaller and medium sized tasks that can be chosen - from as well. -

    - -

    Preparation resources: The Attributor YouTube videos from the - LLVM Developers Meeting 2019 and the recording of the IPO panel from the same - meeting. The Attributor framework as well as other existing inter-procedural - analyses and optimizations in LLVM.

    - -

    Expected results: Measurable better IPO, especially visible in cases - where inlining is not an option or undesirable.

    - -

    Confirmed Mentor: Johannes Doerfert

    -

    Desirable skills: Intermediate knowledge of C++, self motivation.

    -
    - - -
    - Improve parallelism-aware analyses and optimizations -
    - - -
    -

    Description of the project: - This is a short description, please reach out to Johannes (jdoerfert on IRC) - if it sounds interesting. - - With the OpenMPOpt pass (under - review) we started to teach the LLVM optimization pipeline about - OpenMP parallelism encoded as OpenMP runtime calls. - - In this GSoC project we will look at capabilities not yet available in the - OpenMPOpt pass and for the potential to connect existing intra- and - inter-procedural optimizations, e.g. the Attributor. - - In this project there is a lot of freedom to determine the actual tasks but - we will provide a pool of smaller and medium sized tasks that can be chosen - from as well. -

    - -

    Preparation resources: The "Optimizing Indirections, using - abstractions without remorse" video on YouTube from the LLVM Developers - Meeting 2018. The paper "Compiler Optimizations for OpenMP" and "Compiler - Optimizations For Parallel Programs" both by J. Doerfert and H. Finkel (the - slides for these are potentially even more useful).

    - -

    Expected results: Measurable better performance or program analysis - results for parallel programs with a focus on OpenMP.

    - -

    Confirmed Mentor: Johannes Doerfert

    -

    Desirable skills: Intermediate knowledge of C++, self motivation.

    -
    - - -
    - Make LLVM passes debug info invariant -
    - - -
    -

    Description of the project: - Generating debug information is one of the fundamental tasks a compiler - typically fulfills. It is clear that executable generated code should not - depend on the presence of debug information. -

    - Unfortunately there are known cases in LLVM where code generation differs - depending on whether debug information is enabled (`-g`) or not. These kinds - of bugs can lead to bad debug experience ranging from unexpected execution - behaviour to the point of programs running fine in debug mode while crashing - without debug information. -

    - The issue has likely not a single cause but is triggered during different - passes on different architectures. One such reason is the insertion of Call - Frame Information (CFI) in the compiler backend during frame lowering and - other later passes. The presence of CFI instructions seems to change - instruction scheduling which therefore leads to different generated code. -

    - -

    Preparation resources: -

    -

    -

    Expected results: -

    -

    - -

    Confirmed Mentors: Paul Robinson and David Tellenbach

    - -

    Desirable skills: - Intermediate knowledge of C++, some familiarity with general computer - architecture, some familiarity with the x86 or Arm/AArch64 instruction set. -

    -
    - - - -
    - Improve MergeFunctions to incorporate MergeSimilarFunction patches and ThinLTO Support -
    - - -
    -

    Description of the project: MergeSimilarFunctions pass is able to - merge not just identical functions, but also functions with small differences in - their instructions to reduce code size. It does this by inserting control flow - and an additional argument in the merged function to account for the - differences. - - This work was presented at - the LLVM Dev Meeting in - 2013 A more detailed description was published in a paper at - LCTES 2014. The code - was released to the community at the time. Meanwhile, the pass has been in - production use at QuIC for the past few years and has been actively - maintained internally. In order to magnify the impact of - MergeSimilarFunctions, it has been ported to ThinLTO and the patches have - been upstreamed (see stack of 5 patches mentioned below). But instead of - replacing the existing MergeFunctions pass in LLVM-upstream the community - suggested we improve the existing one with the ideas from - MergeSimilarFunctions. And then leverage the ThinLTO on top of that. The - MergeSimilarFunction used in ThinLTO gives impressive code size reduction - across a wide range of workloads and the work was presented at - LLVM-dev - 2018. The LLVM project would greatly benefit from this code size - optimization as most embedded systems (think SmartPhones) applications are - constrained on code-size. -

    -

    Preparation resources: -

    -

    -

    Expected results: -

    -

    - -

    Confirmed Mentors:Aditya Kumar (hiraditya on IRC and phabricator), JF Bastien (jfb on phabricator)

    - -

    Desirable skills: - Course on compiler design, SSA Representation, - Intermediate knowledge of C++, Familiarity with LLVM Core. -

    -
    - - -
    - Add DWARF support to yaml2obj -
    - - -
    -

    Description of the project: - LLVM provides a tool called yaml2obj which converts a YAML document into an - object file, for various different file formats such as ELF, COFF and - Mach-O, along with obj2yaml which does the inverse. The tool is commonly - used to test parts of LLVM, as YAML is often easier to use to describe an - object file than raw assembly and more maintainable than a pre-built binary. - DWARF is a debugging file format commonly used by LLVM. Many of the tests - for LLVM’s DWARF emission are written in assembly, but it would be nicer to - write them in YAML. However, yaml2obj does not properly support emission of - DWARF sections. This project is to add functionality to yaml2obj to make - writing test inputs for DWARF tests simpler, particularly for ELF objects. -

    - -

    Preparation resources: - Reading up on the DWARF file format will be useful, in particular the - standards available at http://dwarfstd.org/Download.php. Also, familiarising - yourself with the basics of the ELF file format, as described here - https://www.sco.com/developers/gabi/latest/contents.html, may be beneficial. -

    -

    Expected results: - The ability to use yaml2obj to generate DWARF sections for object files. - Particularly important is ensuring the input YAML can be more easily - understood than the equivalent assembly. -

    - -

    Confirmed Mentors: James Henderson

    - -

    Desirable skills: - Intermediate knowledge of C++. -

    -
    - - - -
    - Improve hot cold splitting to aggressively outline small blocks -
    - - -
    -

    Description of the project: Hot Cold Splitting in LLVM is an IR level - function splitting transformation. The goal of hot/cold splitting is to improve - the memory locality of code and helps reduce startup working set. The splitting pass - does this by identifying cold blocks and moving them into separate functions. Because it - is implemented at the IR level all the back end target benefit from it. - - It is a relatively new optimization and it was recently presented at - the LLVM Dev Meeting in - 2019 and the slides are here - Because most of the benefit comes from outlining small blocks e.g., __assert_rtn. The goal of this project - is to identify potential blocks via static analysis e.g., exception handling code, optimizing personality functions. - - Use cost-model to ensure outlining reduces the code size of the caller, use tail call whenever appropriate to save - instructions. - -

    -

    Preparation resources: -

    -

    -

    Expected results: -

    -

    - -

    Confirmed Mentors:Aditya Kumar (hiraditya on IRC and phabricator)

    - -

    Desirable skills: - Course on compiler design, SSA Representation, - Intermediate knowledge of C++, Familiarity with LLVM Core. -

    -
    - - - -
    - Advanced Heuristics for Ordering Compiler Optimization Passes -
    - - -
    -

    Description of the project: -Selecting optimization passes for given application is very important but -non-trivial problem because of the huge size of the compiler transformation -space (incl. pass ordering). While the existing heuristics can provide high -performance code for certain applications, they cannot easily benefit a wide -range of application codes. The goal of the project is to learn the interplay -between LLVM transformation passes and code structures, then improve the -existing heuristics (or replace the heuristics with machine learning-based -models) so that the LLVM compiler can provide a superior order of the passes -customized per application. -

    -

    Expected results (possibilities): -

    -

    - -

    Preparation resources: -

    -

    - -

    Confirmed Mentors:EJ Park, Giorgis Georgakoudis, Johannes Doerfert

    - -

    Desirable skills: - C++, Python, experience with LLVM and learning-based prediction preferable. -

    -
    - - - -
    - Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations -
    - - -
    -

    Description of the project: -Current machine learning models for compiler optimization select the best -optimization strategies for functions based on isolated per function analysis. -In this approach, the constructed models are not aware of any relationships -with other functions around it (callers or callees) which can be helpful to -decide the best optimization strategies for each function. In this project, we -want to explore the SCC (Strongly Connected Components) call graph to add -inter-procedural features in constructing machine learning-based models to find -the best optimization strategies per function. Moreover, we want to explore the -case that it is helpful to group strongly related functions together and -optimize them as a group, instead of per function. -

    -

    Expected results (possibilities): -

    -

    - -

    Preparation resources: -

    -

    - -

    Confirmed Mentors:EJ Park, Giorgis Georgakoudis, Johannes Doerfert

    - -

    Desirable skills: - C++, Python, experience with LLVM and learning-based prediction preferable. -

    -
    - - - -
    - -
    - - -
    -

    Description of the project: - There is currently no easy way to use the result of - PostDominatorTreeAnalysis in a loop pass, as PostDominatorTreeAnalysis is a - function analysis, and it is not included in LoopStandardAnalysisResults. If one adds - PostDominatorTreeAnalysis to LoopStandardAnalysisResults, then all loop passes - need to preserve it, meaning that all loop passes need to make sure the result is up to - date. In this project, we want to modify some commonly used utilities to generate a - list of updates, which can be consumed by different updaters, e.g. DomTreeUpdater to - update DominatorTree and PostDominatorTree, and MSSAU to update MemorySSA, - etc., instead of only updating the DominatorTree. In addition, we want to change - existing loop passes to preserve the PostDominatorTree. Finally, we want to add - PostDominatorTree to LoopStandardAnalysisResults. -

    -

    Expected results (possibilities): - PostDominatorTree is added to LoopStandardAnalysisResults and - can be used by loop passes. More common utilities are changed to generate a list of updates - that can easily be consumed by different updaters. -

    -

    Confirmed Mentors: - Whitney Tsang, Ettore Tiotto, Bardia Mahjour -

    -

    Desirable skills: - Intermediate knowledge of C++, self-motivation. -

    -

    Preparation resources: - - - - -

    - - - -
    - Create LoopNest Pass -
    - - -
    -

    Description of the project: - Currently, if you want to write a pass that works on a loop - nest, you have to pick either a function pass or a loop pass. If you choose to write - it as a function pass, then you lose the ability to add loops dynamically back to the - pipeline. If you decide to write it as a loop pass, then you are wasting compile time to - traverse to your pass and return right away when the given loop is not the outermost - loop. In this project, we want to create a LoopNestPass, where transformations - intended for loop nests can inherit from it, and have the same ability as the LoopPass to - dynamically add loops to the pipeline. In addition, create all the adaptors required to - add loop nest passes at different points of the pass builder. -

    -

    Expected results (possibilities): - Transformations/Analyses can be written as LoopNestPass, - without compromising compile time or usability. -

    -

    Confirmed Mentors: - Whitney Tsang, Ettore Tiotto -

    -

    Desirable skills: - Intermediate knowledge of C++, self-motivation. -

    -

    Preparation resources: - - -

    -
    - - - -
    - Instruction properties dumper and checker -
    - - -
    -

    Description of the project: - TableGen is flexible and allows the end-user to define and set common properties of - records (instructions). Every target has dozens or hundreds of such instruction - properties. As target code evolves, the td files become more and more complicated, and - it becomes harder to see whether the setting of some properties is necessary, or even - correct. E.g.: is the hasSideEffects property correctly set on all - instructions? - - One can manually search through the TableGen-generated files, or write some - script to run TableGen and match the output for some specific properties, but a - standalone utility that can dump and check instruction properties - systematically (e.g., also allowing targets to define some verification rules) might be - better from a build-process-management standpoint. This can help to find quite - a few hidden bugs and hence improve the overall codegen code quality. In - addition, the utility can be used to write regression tests for instruction - properties, which will increase the quality and precision of LLVM's - regression tests. -
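    The dump-and-check idea can be illustrated with a minimal Python sketch. It assumes the instruction records have already been extracted into plain objects; `InstrRecord`, `dump_property`, `find_unset`, and the `HYPO_*` instructions are hypothetical names made up for this illustration, and no real TableGen output is parsed. The one rule shown flags instructions that never set a property explicitly, which is one possible kind of verification rule a target could define.

```python
from dataclasses import dataclass, field


@dataclass
class InstrRecord:
    """Hypothetical, simplified view of one instruction record."""
    name: str
    # Property name -> bool; a missing key means the property was never set.
    props: dict = field(default_factory=dict)


def dump_property(records, prop):
    """Return one printable line per instruction for the given property."""
    lines = []
    for rec in records:
        value = rec.props.get(prop)
        shown = "?" if value is None else int(value)
        lines.append(f"{rec.name}: {prop} = {shown}")
    return lines


def find_unset(records, prop):
    """Example rule: list instructions that never set `prop` explicitly."""
    return [rec.name for rec in records if rec.props.get(prop) is None]


# Illustrative data only; these are not real instructions of any target.
SAMPLE = [
    InstrRecord("HYPO_ADD", {"hasSideEffects": False, "mayStore": False}),
    InstrRecord("HYPO_STORE", {"mayStore": True}),
]

if __name__ == "__main__":
    print("\n".join(dump_property(SAMPLE, "hasSideEffects")))
    print("unset:", find_unset(SAMPLE, "hasSideEffects"))
```

    A real utility would read the records produced by TableGen instead of a hand-written list, and its rule output could be matched by regression tests.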

    -

    Expected results (possibilities): - A standalone llvm tool or utility that can dump and check instruction properties systematically -

    -

    Confirmed Mentors: - Hal Finkel, Jinsong Ji , Qingshan Zhang -

    -

    Desirable skills: - Intermediate knowledge of C++, self-motivation. -

    -
    - - - -
    - Unify ways to move code or check if code is safe to be moved -
    - - -
    -

    Description of the project: - Determining whether it is safe to move code around is - implemented in several transformations in LLVM (e.g. canSinkOrHoistInst in LICM, - or makeLoopInvariant in Loop). Each of these implementations may return different - results for a given query, making code motion safety checks inconsistent and - duplicated. On the other hand, the mechanism for doing the actual code motion is also - different in each transformation. Code duplication causes maintenance problems and - increases the time taken to write new transformations. In this project, we want to first - identify all the existing ways in loop transformations (could be function or loop pass) - to check if code is safe to move, and to move code, and create a standardized way to do - so. -

    -

    Expected results (possibilities): - A standardized way (a superset of all the existing ways in loop - transformations) of checking if code is safe to be moved and of moving it. -

    - -

    Confirmed Mentors: - Whitney Tsang, Ettore Tiotto, Bardia Mahjour -

    -

    Desirable skills: - Intermediate knowledge of C++, self-motivation. -

    -

    Preparation resources: - - - -

    -
    - - -
    - MLIR -
    - - -
    -

    All the items in the list of -open projects -are open to GSoC. Feel free to propose your own ideas as well on -Discourse. -

    - - -
    - Find null smart pointer dereferences - with the Static Analyzer -
    - - -
    -

    Description of the project: - The Clang Static Analyzer already knows how to prevent crashes caused by - null pointer dereference in arbitrary code, however it often "gives up" - when the code is too complicated. In particular, implementation details - of C++ standard classes, even simple ones such as smart pointers - or optionals, may be too convoluted for the Analyzer to fully understand. - Moreover, the exact behavior depends on which implementation of - the Standard Library is used (e.g., GNU libstdc++ or LLVM's own libc++). -

    -

    - We can enable the Analyzer to find more bugs in modern C++ code - by teaching it explicitly about the behavior of C++ standard classes, - and therefore skipping the whole process in which the Analyzer - tries to understand all the implementation details on its own. - For example, we could teach it that a default-constructed smart pointer - is null, and any attempt to dereference it would result in a crash. - The project would therefore consist in manually providing implementations - for various methods of standard classes. -

    - -

    Expected results: - We want the Static Analyzer to emit warnings when a null smart pointer - dereference would occur in the code. For example: -

    -    #include <memory>
    -
    -    int foo(bool flag) {
    -      std::unique_ptr<int> x;  // note: Default constructor produces a null unique pointer;
    -
    -      if (flag)                // note: Assuming 'flag' is false;
    -        return 0;              // note: Taking false branch
    -
    -      return *x;               // warning: Dereferenced smart pointer 'x' is null.
    -    }
    -    
    - We should be able to cover at least one class fully, for example, std::unique_ptr, - and then see if we can generalize our results to other classes, such as std::shared_ptr - or the C++17 std::optional. -

    - - -

    Confirmed Mentor: Artem Dergachev, Gábor Horváth

    - -

    Desirable skills: - Intermediate knowledge of C++. -

    -
    - - - -
    - LLDB -
    - - - -
    - Support autosuggestions in LLDB's command line -
    - - -
    -

    Description of the project: LLDB's command line offers several convenience - features that are inspired by features of UNIX shells such as tab completions or a command history. - One feature that is not implemented yet is 'autosuggestions'. These are suggestions - for possible commands that the user might want to type, but unlike tab completions they - are displayed directly behind the cursor while the user is typing a command. A good demonstration - of what this could look like is the autosuggestions implemented in fish shell. -

    -

    - This project is about implementing autosuggestions in LLDB's editline-based command shell. -

    -

    Confirmed Mentor: - Jonas Devlieghere and Raphael Isemann

    -

    Desirable skills: - Intermediate knowledge of C++. -

    -
    - - -
    - Implement the missing tab completions for LLDB's command line -
    - - -
    -

    Description of the project: LLDB's command line offers several convenience - features that are inspired by features of UNIX shells such as tab completions for commands. - These tab completions are implemented by a completion engine that is not only used by the - command line interface of LLDB, but also by graphical interfaces for LLDB such as IDEs. - - While the tab completions in LLDB are really useful, they are currently not implemented for - all commands and their respective arguments. This project is about implementing the remaining - completions for the commands in LLDB, which will greatly improve the user experience of LLDB. - Improving existing completions is also part of the project. - - Note that the completions are not a static list of strings but often require inspecting and - understanding the internal state of LLDB. As LLDB commands and their tab completions cover - all aspects of LLDB, this project offers a great way to get an overview of all the functionality - in LLDB. -

    -

    Confirmed Mentor:Raphael Isemann

    - -

    Desirable skills: - Intermediate knowledge of C++. -

    -
    - - - -
    - Reimplement LLDB's command-line commands - using the public SB API. -
    - - -
    -

    Description of the project: Just as LLVM is a library to - build compilers, LLDB is a library to build debuggers. LLDB vends - a stable, public SB API. Due to historic reasons the LLDB command - line interface is currently implemented on top of LLDB's private - API and it duplicates a lot of functionality that is already - implemented in the public API. Rewriting LLDB's command line - interface on top of the public API would simplify the - implementation, eliminate duplicate code, and most importantly - reduce the testing surface. -

    -

    - This work will also provide an opportunity to clean up the SB API - of commands that have accrued too many overloads over time and - convert them to make use of option classes to both gather up all - the variants and also future-proof the APIs. -

    -

    Confirmed Mentor:Adrian Prantl and Jim Ingham

    - -

    Desirable skills: - Intermediate knowledge of C++. -

    -
    - - -
    - Add support for batch-testing to the LLDB - testsuite. -
    - - -
    -

    Description of the project: One of the tensions in the - testsuite is that spinning up a process and getting it to some - point is not a cheap operation, so you'd like to do a bunch of - tests when you get there. But the current testsuite bails at the - first failure, so you don't want to do many tests since the - failure of one fails all the others. On the other hand, there are - some individual test assertions where the failure of the assertion - should cause the whole test to fail. For example, if you - fail to stop at a breakpoint where you want to check some variable - values, then the whole test should fail. But if your test then - wants to check the value of five independent locals, it should be - able to do all five, and then report how many of the five variable - assertions failed. We could do this by adding Start - and End markers for a batch of tests, do all the tests in - the batch without failing the whole test, and then report the - error and fail the whole test if appropriate. There might also be - a nice way to do this in Python using scoped objects for the test - sections. -
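    The batching idea described above could be sketched in Python with a context manager as the scoped object. This is only an illustration under stated assumptions: `AssertionBatch`, `BatchFailure`, `check_equal`, and `run_example` are hypothetical names and are not part of the LLDB test suite.

```python
class BatchFailure(AssertionError):
    """Raised once at the end of a batch if any soft check failed."""


class AssertionBatch:
    """Hypothetical scoped object: collects failures instead of bailing early."""

    def __init__(self, name):
        self.name = name
        self.checks = 0
        self.failures = []

    def __enter__(self):
        # Plays the role of the "Start" marker of the batch.
        return self

    def check_equal(self, actual, expected, label):
        # A soft check: record the mismatch and keep going.
        self.checks += 1
        if actual != expected:
            self.failures.append(
                f"{label}: expected {expected!r}, got {actual!r}"
            )

    def __exit__(self, exc_type, exc, tb):
        # Plays the role of the "End" marker of the batch.
        if exc_type is not None:
            # A hard failure inside the batch propagates unchanged.
            return False
        if self.failures:
            summary = (
                f"{self.name}: {len(self.failures)} of "
                f"{self.checks} checks failed"
            )
            raise BatchFailure("\n".join([summary] + self.failures))
        return False


def run_example():
    """Check three independent values and report all mismatches at once."""
    observed = {"a": 1, "b": 2, "c": 30}  # stand-in for local variable values
    try:
        with AssertionBatch("locals") as batch:
            batch.check_equal(observed["a"], 1, "a")
            batch.check_equal(observed["b"], 20, "b")
            batch.check_equal(observed["c"], 3, "c")
    except BatchFailure as err:
        return str(err)
    return "all checks passed"
```

    In this sketch, an ordinary exception raised inside the `with` block (for example, failing to stop at the breakpoint) still fails the whole test immediately, while the independent value checks are all evaluated before one combined failure is reported.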

    -

    Confirmed Mentor: Jim Ingham

    - -

    Desirable skills: - Intermediate knowledge of Python. -

    -
    - - -
    - Google Summer of Code 2019 -
    - - -
    -

    Google Summer of Code 2019 contributed a lot to the LLVM project. For the list of - accepted and completed projects, please take a look into Google Summer of Code - website. -

    -
    - - -
    - LLVM -
    - - - -
    - Debug Info should have no - effect on codegen -
    - - -
    -

    Description of the project: - Adding Debug Info (compiling with `clang -g`) shouldn't change the - generated code at all. Unfortunately we have bugs. These are usually not - too hard to fix and a good way to discover new parts of the codebase! - We suggest building object files both ways and disassembling the - text sections, which will give cleaner diffs than comparing .s files. -

    - -

    Expected results: Reduced test cases, bug reports with analysis - (e.g., which pass is responsible), possibly patches.

    - -

    Confirmed Mentor: Paul Robinson

    -

    Desirable skills: Intermediate knowledge of C++, some familiarity - with x86 or ARM instruction set.

    -
    - - - -
    - Clang -
    - - - -
    - Implement an ASTImporter fuzzer -
    - - -
    -

    Description of the project: - Clang contains an ASTImporter which allows moving declarations and - statements from one Clang AST to another. This is for example used for - static analysis across translation units and in LLDB's expression - evaluator. -

    -

    - The current ASTImporter works as intended when moving simple C code from - one AST to another. However, more complicated declarations such as C++'s - OOP features and templates are not fully implemented and can cause crashes - or invalid AST nodes. The bug reports related to these crashes are often - filed against LLDB's expression evaluator and are rarely submitted with a - minimal reproducer. This makes improving ASTImporter a time-consuming and - tedious task. -

    -

    - This project is about writing a fuzzer to proactively discover these - ASTImporter bugs and provide minimal reproducers which make understanding - and fixing the underlying bug easier. -

    -

    - A possible implementation of such a fuzzer and driver could look like this: - -

    - This is just one possible approach and students are welcome to submit their - own ideas on how the fuzzer should operate. Approaches that make it possible to - automatically verify more aspects of the imported AST (e.g. the source - locations of AST nodes, size of RecordDecls) are encouraged. The fuzzer and - driver should be implemented in C++ and/or Python. -

    -

    Confirmed Mentor: Raphael Isemann, Shafik Yaghmour

    -

    Desirable skills: Intermediate knowledge of C++.

    -
    - - -
    - Improve shell autocompletion for Clang -
    - - -
    -

    Description of the project: Clang has a newly implemented autocompletion feature; details can be found on the LLVM blog. We would like to improve this by adding more flags to autocompletion, supporting more shells (currently it supports only bash) and exporting this feature to other projects such as llvm-opt. The accepted student will be working on the Clang Driver, LLVM Options and shell scripts. -

    - -

    Expected Results: Autocompletion working on bash and zsh, support llvm-opt options.

    - -

    Confirmed Mentor: Yuka Takahashi and Vassil Vassilev

    - -

    Desirable skills: - Intermediate knowledge of C++ and shell scripting -

    -
    - - -
    - Improve Clang Diagnostics -
    - - -
    -

    Description: - Clang diagnostics (warnings and errors) issued to the programmer are a critical - feature of the compiler. Great diagnostics can have a significant impact on the - user experience of the compiler and increase users' productivity. -

    - -

    - Recent improvements in GCC 9.0 show that there is significant headroom to - improve diagnostics (and user interactions in general). It would be a very - impactful project to survey and identify all the possible improvements to clang - on this topic, and start redesigning the next generation of our diagnostics. -

    - -

    Desirable skills: C++ coding experience

    -
    - - -
    - Google Summer of Code 2018 -
    - - -
    -

    Google Summer of Code 2018 contributed a lot to the LLVM project. For the list of - accepted and completed projects, please take a look into Google Summer of Code - website. -

    -
    - - -
    - Google Summer of Code 2017 -
    - - -
    -

    Google Summer of Code 2017 contributed a lot to the LLVM project. For the list of - accepted and completed projects, please take a look into Google Summer of Code - website. -

    -
    -
    What is this?