Learner0x5a's Studio.

Machine Learning on Source Code

2020/12/14

ML4CODE

This field focuses on three main problems:

  * Code generation
  * Code representation
  * Pattern recognition

CodRep

A competition on code representation learning: given source code, the task is to fix the bugs in it.

Papers

Program Generation

Synthetic Datasets for Neural Program Synthesis, ICLR 2019
Execution-Guided Neural Program Synthesis, ICLR 2019
DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testing, AAAI 2019
Towards Synthesizing Complex Programs from Input-Output Examples, ICLR 2018

Source Code Analysis and Language Models

Generative Code Modeling with Graphs, ICLR 2019
NL2Type: Inferring JavaScript Function Types from Natural Language Information, ICSE 2019
A Novel Neural Source Code Representation based on Abstract Syntax Tree, ICSE 2019
Deep Learning Type Inference, FSE 2018
Learning to Represent Programs with Graphs, ICLR 2018
Are Deep Neural Networks the Best Choice for Modeling Source Code?, FSE 2017

Neural Network Architectures and Algorithms

Neural Code Comprehension: A Learnable Representation of Code Semantics, NeurIPS 2018
A General Path-Based Representation for Predicting Program Properties, PLDI 2018
Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks, AAAI 2018

Embeddings

A Literature Study of Embeddings on Source Code, arXiv
Deep Code Search, ICSE 2018
Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces, FSE 2018

Program Translation

Towards Neural Decompilation, arXiv
Tree-to-tree Neural Networks for Program Translation, ICLR 2018
Code Attention: Translating Code to Comments by Exploiting Domain Features, arXiv

Code Completion

Aroma: Code Recommendation via Structural Code Search, arXiv

Program Repair

Maximal Divergence Sequential Autoencoder for Binary Software Vulnerability Detection, ICLR 2019
Neural Program Repair by Jointly Learning to Localize and Repair, ICLR 2019
Dynamic Neural Program Embedding for Program Repair, ICLR 2018

Pattern Recognition

SAR: Learning Cross-Language API Mappings with Little Knowledge, FSE 2019
Hierarchical Learning of Cross-Language Mappings through Distributed Vector Representations for Code, ICSE 2018
Deep API Learning, FSE 2016

Code Optimization

Neural Nets Can Learn Function Type Signatures From Binaries, USENIX Security 2017

Code Summarization

Summarizing Source Code with Transferred API Knowledge, IJCAI 2018
A Neural Framework for Retrieval and Summarization of Source Code, ASE 2018
