A linguist-inspired language classifier with multiple file source handlers
Feature/Behavior | linguist | gengo |
---|---|---|
Analyze Git Revision | Yes | Yes |
Analyze Directory | No | Yes |
Requires Git Repository | Yes | No |
Detect Language by Extension | Yes | Yes |
Detect Language by Filename | Yes | Yes |
Detect by Filepath Pattern | No | Yes |
Detect Language with Heuristics | Yes | Yes |
Detect Language with Classifier | Yes | Not Yet ;) |
View the installation documentation.
gengo supports multiple file sources. Each file source has its own usage, which lets you take advantage of its strengths and work around its weaknesses.
The directory file source is very generic: it tries not to make many assumptions about your environment and workspace.
You can use a `.gitignore` file and/or an `.ignore` file to prevent files from being scanned. See the `ignore` documentation for more details.
The git file source is highly opinionated -- it tries to act like a git utility, and uses git tools. Its goal is to behave similarly to linguist. This means that this file source does not need any actual files present, and can work on a bare repository, making it suitable for usage with a Git server.
Like linguist, you can override behavior using a `.gitattributes` file. Basically, just replace `linguist-FOO` with `gengo-FOO`. Unlike linguist, `gengo-detectable` always causes a file to be included in statistics (linguist still excludes files that are generated or vendored).
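The difference in how the detectable override interacts with the generated and vendored flags can be sketched as a small decision function. This is only an illustrative model, not gengo's or linguist's actual code: the `Flags` struct and both function names are hypothetical, the default for files without an explicit override is an assumption, and language category and documentation handling are left out entirely.

```rust
/// Hypothetical model of the attributes relevant to the comparison above.
/// `detectable` is `None` when no detectable attribute is set for the file.
#[derive(Clone, Copy, Debug)]
pub struct Flags {
    pub generated: bool,
    pub vendored: bool,
    pub detectable: Option<bool>,
}

/// gengo-style rule: an explicit detectable override always wins.
pub fn gengo_includes(flags: Flags) -> bool {
    match flags.detectable {
        // `gengo-detectable` includes the file; `-gengo-detectable` is
        // assumed here to exclude it.
        Some(explicit) => explicit,
        // Assumed default: generated or vendored files are left out.
        None => !(flags.generated || flags.vendored),
    }
}

/// linguist-style rule: generated or vendored files stay excluded even
/// when the detectable override is set.
pub fn linguist_includes(flags: Flags) -> bool {
    if flags.generated || flags.vendored {
        return false;
    }
    flags.detectable.unwrap_or(true)
}
```

Under this model, a generated file that is also marked detectable is counted by `gengo_includes` but not by `linguist_includes`, which is the one difference the paragraph above describes.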
```gitattributes
# .gitattributes

# boolean attributes:
# These can be *negated* by prefixing with `-` (`-gengo-documentation`).

# Mark a file as documentation
*.html gengo-documentation
# Mark a file as generated
my-built-files/* gengo-generated
# Mark a file as vendored
deps/* gengo-vendored

# string attributes:

# Override the detected language for a file
# Use the Language enum's variant name (see docs.rs for more details)
templates/*.js gengo-language=PlainText
```
You will need to commit your `.gitattributes` file for it to take effect.
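If you are migrating an existing linguist configuration, the `linguist-FOO` to `gengo-FOO` rename can be done mechanically. The following is a minimal sketch using only the Rust standard library; `rename_linguist_attributes` is a hypothetical helper, not something gengo provides. It only renames attribute names, so values of `gengo-language` may still need manual adjustment to match the Language enum's variant names.

```rust
/// Rewrites `linguist-` attribute names to `gengo-` in `.gitattributes` text.
/// Hypothetical helper for illustration; not part of gengo.
///
/// Simplifications: runs of whitespace are collapsed to single spaces, a
/// trailing newline is dropped, and quoted patterns containing spaces are
/// not handled.
pub fn rename_linguist_attributes(gitattributes: &str) -> String {
    gitattributes
        .lines()
        .map(rename_line)
        .collect::<Vec<_>>()
        .join("\n")
}

fn rename_line(line: &str) -> String {
    // Leave comments and blank lines untouched.
    if line.trim().is_empty() || line.trim_start().starts_with('#') {
        return line.to_string();
    }
    let mut tokens = line.split_whitespace();
    // The first token is the path pattern; it is never rewritten.
    let mut out: Vec<String> = tokens.next().map(str::to_string).into_iter().collect();
    out.extend(tokens.map(rename_attr));
    out.join(" ")
}

fn rename_attr(attr: &str) -> String {
    // An attribute may carry a `-` (unset) or `!` (unspecified) prefix.
    let (prefix, name) = if let Some(rest) = attr.strip_prefix('-') {
        ("-", rest)
    } else if let Some(rest) = attr.strip_prefix('!') {
        ("!", rest)
    } else {
        ("", attr)
    };
    match name.strip_prefix("linguist-") {
        Some(rest) => format!("{prefix}gengo-{rest}"),
        None => attr.to_string(),
    }
}
```

For example, the line `deps/* linguist-vendored` becomes `deps/* gengo-vendored`, while unrelated attributes such as `text` are passed through unchanged.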