TMSC

TMSC (Topics Modeling on Source Code) is a command-line application that discovers the topics of a repository you provide. A "topic" is a set of keywords, in this case source code identifiers, that tend to occur together. This project is unrelated to GitHub topics.

$ tmsc https://github.com/apache/spark
...
                Parallel and distributed processing - General IT	4.43
                Machine Learning, sklearn-like APIs - General IT	3.87
               Java/JS + async + JSON serialization - General IT	3.58
                Java string input/output - Programming languages	3.29
                            Cryptography: libraries - General IT	3.23
                        SQL, working with databases - General IT	3.11
                          Java: Spring, Hibernate - Technologies	3.09
                              Operations on numbers - General IT	2.98
                               Distributed clusters - General IT	2.62
           Functional programming, Scala - Programming languages	2.60
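
The listing above appears to be one topic per line, with the label and its score separated by a tab. As a minimal sketch under that assumption, captured output could be turned into (label, score) pairs like this. The function name parse_tmsc_output is hypothetical and not part of tmsc.

```python
def parse_tmsc_output(text):
    """Parse 'label<TAB>score' lines into a list of (label, score) tuples.

    Assumes the tab-separated layout shown in the sample output; this helper
    is illustrative and is not provided by tmsc itself.
    """
    topics = []
    for line in text.splitlines():
        # Split on the last tab so that labels containing tabs stay intact.
        label, sep, score = line.rpartition("\t")
        if not sep:
            continue  # no tab on this line, e.g. log output
        try:
            topics.append((label.strip(), float(score)))
        except ValueError:
            continue  # last field is not a number
    return topics


sample = (
    "    Parallel and distributed processing - General IT\t4.43\n"
    "    Machine Learning, sklearn-like APIs - General IT\t3.87\n"
)
for label, score in parse_tmsc_output(sample):
    print(f"{score:.2f}  {label}")
```

Lines without a tab or without a numeric last field are skipped, so progress or log lines printed before the topics do not break the parse.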

Automatic topic inference can be useful for cataloging repositories or mining concepts from them. The current model was trained on GitHub repositories cloned in October 2016 after de-fuzzy-forking. There is a paper on it.

Installation

pip3 install tmsc

Usage

Command line:

$ tmsc https://github.com/apache/spark

Python API:

import tmsc

engine = tmsc.Topics()
print(engine.query("https://github.com/apache/spark"))
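
For cataloging several repositories, the same engine object can probably be reused across queries. The sketch below assumes only what the snippet above shows: an object with a query(url) method. The helper query_many and the stub class are hypothetical; the stub stands in for tmsc.Topics() so the example runs without downloading models, and the return type of query is not assumed.

```python
def query_many(engine, urls):
    """Query each URL with one engine and collect results keyed by URL.

    A failure on one repository is recorded as the exception object instead
    of aborting the whole batch. Hypothetical helper, not part of tmsc.
    """
    results = {}
    for url in urls:
        try:
            results[url] = engine.query(url)
        except Exception as exc:  # keep going; the caller inspects failures
            results[url] = exc
    return results


class StubEngine:
    """Stand-in for tmsc.Topics(); returns a placeholder for any URL."""

    def query(self, url):
        return "topics for " + url


# With tmsc installed, pass tmsc.Topics() instead of StubEngine().
catalog = query_many(StubEngine(), ["https://github.com/apache/spark"])
for url, result in catalog.items():
    print(url, "->", result)
```

Passing the engine in as a parameter keeps model loading in one place and makes the helper easy to exercise without network access.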

Docker image

docker build -t srcd/tmsc .
docker run -d --privileged -p 9432:9432 --name bblfshd bblfsh/bblfshd
docker exec -it bblfshd bblfshctl driver install --recommended
docker run -it --rm srcd/tmsc https://github.com/apache/spark

In order to cache the downloaded models:

docker run -it --rm -v /path/to/cache/on/host:/root srcd/tmsc https://github.com/apache/spark

Contributions

...are welcome! See CONTRIBUTING and code of conduct.

License

Apache 2.0
