-
Notifications
You must be signed in to change notification settings - Fork 3
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
0 parents
commit 886f091
Showing
75 changed files
with
17,297 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
The MIT License | ||
|
||
Copyright (c) 2019- Dana-Farber Cancer Institute | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining | ||
a copy of this software and associated documentation files (the | ||
"Software"), to deal in the Software without restriction, including | ||
without limitation the rights to use, copy, modify, merge, publish, | ||
distribute, sublicense, and/or sell copies of the Software, and to | ||
permit persons to whom the Software is furnished to do so, subject to | ||
the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be | ||
included in all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, | ||
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF | ||
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND | ||
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS | ||
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN | ||
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN | ||
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
CC= gcc | ||
CFLAGS= -g -Wall -Wc++-compat -std=c99 -msse4 -O3 | ||
CPPFLAGS= | ||
INCLUDES= | ||
OBJS= kalloc.o kthread.o algo.o sys.o gfa-base.o gfa-io.o gfa-aug.o gfa-bbl.o gfa-ed.o \ | ||
sketch.o misc.o bseq.o options.o shortk.o miniwfa.o \ | ||
index.o lchain.o gchain1.o galign.o gcmisc.o map-algo.o cal_cov.o \ | ||
format.o gmap.o ggsimple.o ggen.o asm-call.o | ||
PROG= minigraph | ||
LIBS= -lz -lpthread -lm | ||
|
||
ifneq ($(asan),) | ||
CFLAGS+=-fsanitize=address | ||
LIBS+=-fsanitize=address -ldl | ||
endif | ||
|
||
.SUFFIXES:.c .o | ||
.PHONY:all clean depend | ||
|
||
.c.o: | ||
$(CC) -c $(CFLAGS) $(CPPFLAGS) $(INCLUDES) $< -o $@ | ||
|
||
all:$(PROG) | ||
|
||
minigraph:$(OBJS) main.o | ||
$(CC) $(CFLAGS) $^ -o $@ $(LIBS) | ||
|
||
clean: | ||
rm -fr gmon.out *.o a.out $(PROG) *~ *.a *.dSYM | ||
|
||
depend: | ||
(LC_ALL=C; export LC_ALL; makedepend -Y -- $(CFLAGS) $(DFLAGS) -- *.c) | ||
|
||
# DO NOT DELETE | ||
|
||
algo.o: kalloc.h algo.h miniwfa.h kvec-km.h ksort.h | ||
asm-call.o: mgpriv.h minigraph.h gfa.h ggen.h bseq.h gfa-priv.h algo.h | ||
bseq.o: bseq.h kvec-km.h kalloc.h kseq.h | ||
cal_cov.o: mgpriv.h minigraph.h gfa.h gfa-priv.h algo.h kalloc.h | ||
format.o: kalloc.h mgpriv.h minigraph.h gfa.h | ||
galign.o: mgpriv.h minigraph.h gfa.h kalloc.h miniwfa.h | ||
gchain1.o: mgpriv.h minigraph.h gfa.h ksort.h khashl.h kalloc.h gfa-priv.h | ||
gcmisc.o: mgpriv.h minigraph.h gfa.h kalloc.h | ||
gfa-aug.o: gfa-priv.h gfa.h ksort.h | ||
gfa-base.o: gfa-priv.h gfa.h kstring.h khashl.h kalloc.h ksort.h | ||
gfa-bbl.o: gfa-priv.h gfa.h kalloc.h ksort.h kvec.h | ||
gfa-ed.o: gfa-priv.h gfa.h kalloc.h ksort.h khashl.h kdq.h kvec-km.h | ||
gfa-io.o: kstring.h gfa-priv.h gfa.h kseq.h | ||
ggen.o: kthread.h kalloc.h sys.h bseq.h ggen.h minigraph.h gfa.h mgpriv.h | ||
ggen.o: gfa-priv.h | ||
ggsimple.o: mgpriv.h minigraph.h gfa.h gfa-priv.h kalloc.h bseq.h algo.h | ||
ggsimple.o: sys.h ggen.h kvec-km.h | ||
gmap.o: kthread.h kalloc.h bseq.h sys.h mgpriv.h minigraph.h gfa.h gfa-priv.h | ||
index.o: mgpriv.h minigraph.h gfa.h khashl.h kalloc.h kthread.h kvec-km.h | ||
index.o: sys.h | ||
kalloc.o: kalloc.h | ||
kthread.o: kthread.h | ||
lchain.o: mgpriv.h minigraph.h gfa.h kalloc.h krmq.h | ||
main.o: mgpriv.h minigraph.h gfa.h gfa-priv.h sys.h ketopt.h | ||
map-algo.o: kalloc.h mgpriv.h minigraph.h gfa.h khashl.h ksort.h | ||
miniwfa.o: miniwfa.h kalloc.h | ||
misc.o: mgpriv.h minigraph.h gfa.h ksort.h | ||
options.o: mgpriv.h minigraph.h gfa.h sys.h | ||
shortk.o: mgpriv.h minigraph.h gfa.h ksort.h kavl.h algo.h khashl.h kalloc.h | ||
sketch.o: kvec-km.h kalloc.h mgpriv.h minigraph.h gfa.h | ||
sys.o: sys.h |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,306 @@ | ||
Release 0.19-r551 (12 June 2022) | ||
-------------------------------- | ||
|
||
This release fixes a segmental fault if minigraph is compiled with certain | ||
compiler-libc combinations. This is apparently caused by memcpy(0,0,0). | ||
Minigraph is otherwise identical v0.18. | ||
|
||
(12 June 2022, r551) | ||
|
||
|
||
|
||
Release 0.18-r538 (9 May 2022) | ||
------------------------------ | ||
|
||
This release uses heuristics to speed up base alignment in long divergent | ||
regions. The heuristics does not guarantee optimal alignment but it reliably | ||
produces alignment close to the optimal except in centromeres where the | ||
algorithmically optimal alignment may not represent true evolution in biology. | ||
The new version is 10-700% faster than v0.17 depending on input data and | ||
parameters in use. | ||
|
||
(9 May 2022, r538) | ||
|
||
|
||
|
||
Release 0.17-r524 (29 April 2022) | ||
--------------------------------- | ||
|
||
This release adds base alignment to minigraph. It represents the first major | ||
improvement to minigraph. Specifically, this release attempts to connect linear | ||
chains with the graph wavefront alignemnt algorithm (GWFA) and produces the | ||
final alignment with miniwfa under the 2-piece gap penalty. Graph generation | ||
also considers base alignment. This gives more accurate graph alignment and | ||
generally simpler graph topology. Note that minigraph still focuses on | ||
structural variations and does not generate base-level graphs. To endusers, | ||
minigraph remains similar feature wise. | ||
|
||
Notable changes: | ||
|
||
* New feature: option `-c` for base alignment and graph generation. In the | ||
alignment mode, the option adds the `cg:Z` CIGAR tag like minimap2. Graph | ||
generation still works without `-c` but applying this option is generally | ||
recommended now. | ||
|
||
It should be noted that the base alignment is currently slow for species of | ||
high diversity. This will be addressed in the next couple of releases. | ||
|
||
(29 April 2022, r524) | ||
|
||
|
||
|
||
Release 0.16-r436 (21 February 2022) | ||
------------------------------------ | ||
|
||
Notable changes: | ||
|
||
* Improvement: 2-level chaining. This is a feature backported from minimap2. | ||
It speeds up graph generation for human graphs. | ||
|
||
* Improvement: break a chain at poorly aligned regions, another recent | ||
minimap2 feature. | ||
|
||
* Added the script for generating figures in the minigraph paper. | ||
|
||
(21 February 2022, r436) | ||
|
||
|
||
|
||
Release 0.15-r426 (21 March 2021) | ||
--------------------------------- | ||
|
||
Fixed a bug in bubble identification around inversions. This version should be | ||
used together with the latest gfatools for consistency. | ||
|
||
(21 March 2021, r426) | ||
|
||
|
||
|
||
Release 0.14-r415 (19 December 2020) | ||
------------------------------------ | ||
|
||
Notable changes: | ||
|
||
* Added the `--call` option to find the allele/walk in each bubble. | ||
|
||
* Reduced the default minimum variant length (option `-L`) from 100 to 50 for | ||
the consistency with the SV community. | ||
|
||
(19 December 2020, r415) | ||
|
||
|
||
|
||
Release 0.13-r397 (3 December 2020) | ||
----------------------------------- | ||
|
||
Notable change: | ||
|
||
* Fixed incorrect anchors in linear chains. In older versions, a linear chain | ||
may contain two anchors with identical reference or query coordinates. | ||
|
||
(3 December 2020, r397) | ||
|
||
|
||
|
||
Release 0.12-r389 (26 October 2020) | ||
----------------------------------- | ||
|
||
Notable changes: | ||
|
||
* Improve alignments towards ends of graph segments. If there is an SV close to | ||
the ends but not at the ends, older versions may produce an excessively | ||
large bubble including high-identity matches. | ||
|
||
* Heuristically accelerates alignment in complex subgraphs by skipping | ||
many unnecessary sequence-aware graph traversals. This speeds up graph | ||
generation for CHM13 by three folds without obviously affecting accuracy. | ||
|
||
* Added option --inv to optionally disable inversions. Graph traversal is hard | ||
with inversions. | ||
|
||
* Fixed the bug that prevents large -K. | ||
|
||
* Apply option -K4g to the asm preset. | ||
|
||
* Added option --write-mz to output the positions of minimizer anchors. | ||
|
||
(26 October 2020, r389) | ||
|
||
|
||
|
||
Release 0.11-r371 (13 September 2020) | ||
------------------------------------- | ||
|
||
Notable changes: | ||
|
||
* Added option --max-rmq-size to limit the max RMQ size, which is set 100k by | ||
default. This heuristic reduces the long running time for aligning long | ||
centromeric sequences. The accuracy might be affected in rare cases. | ||
|
||
* Cap the max k-mer occurrence to 250 by default. For maize genomes, the | ||
current heuristic may choose an occurrence cutoff larger than 1000. This | ||
makes minigraph too slow to be practical. | ||
|
||
* Added option -S to output more detailed information about linear chains. | ||
|
||
* Added option -D to ignore diagonal minimizer anchors. This is useful to | ||
mapping a sequence against itself. | ||
|
||
(13 September 2020, r371) | ||
|
||
|
||
|
||
Release 0.10-r356 (14 February 2020) | ||
------------------------------------ | ||
|
||
Notable changes: | ||
|
||
* Older releases miss a small fraction of INDELs involving repeats. This | ||
release fixes this issue. | ||
|
||
* Added the "stableGaf" command to mgutils.js to convert unstable GAF (e.g. by | ||
GraphAligner) to stable GAF. | ||
|
||
(14 February 2020, r356) | ||
|
||
|
||
|
||
Release 0.9-r343 (31 December 2019) | ||
----------------------------------- | ||
|
||
Notable changes: | ||
|
||
* RMQ based linear chaining. The chaining accuracy should be higher for large | ||
events. The speed remains similar. | ||
|
||
* Use ksw2 to check the sequence divergence of events to be inserted. | ||
|
||
* Treat inversions as special events. Don't insert them as long substitutions. | ||
|
||
(31 December 2019, r343) | ||
|
||
|
||
|
||
Release 0.8-r316 (11 December 2019) | ||
----------------------------------- | ||
|
||
This release reduces suboptimal chains caused by the chaining heuristics. It | ||
generates slightly simpler human graphs. | ||
|
||
(11 December 2019, r316) | ||
|
||
|
||
|
||
Release 0.7-r310 (21 November 2019) | ||
----------------------------------- | ||
|
||
Notable changes: | ||
|
||
* Increased the default maximum INDEL/event length from 10kb to 100kb for | ||
assembly mapping and graph generation. | ||
|
||
* Decreased the default minimum INDEL/event length from 250bp to 100bp. | ||
|
||
* Accelerated graph mapping by pre-filtering isolated anchors and disconnected | ||
linear chains. This triples the performance when long gaps are desired. | ||
|
||
Due to the change of default parameters, this release generates graphs | ||
different from the previous versions. | ||
|
||
(21 November 2019, r310) | ||
|
||
|
||
|
||
Release 0.6-r302 (17 November 2019) | ||
----------------------------------- | ||
|
||
Notable changes: | ||
|
||
* Assign weight to seeds based on their repetitiveness. This helps chaining in | ||
repetitive regions a little bit. | ||
|
||
* For short-read mapping, prefer the reference path if the alternate path is | ||
not much better. | ||
|
||
Major changes may be coming in the next release. | ||
|
||
(17 November 2019, r302) | ||
|
||
|
||
|
||
Release 0.5-r285 (8 September 2019) | ||
----------------------------------- | ||
|
||
Notable changes: | ||
|
||
* Fixed a bug that leads to wrong mapping positions in GAF. | ||
|
||
* Fixed two bugs related to graph chaining. | ||
|
||
* Added option `-j` to set expected sequence divergence and to adjust other | ||
chaining parameters accordingly. | ||
|
||
* Increased the k-mer thresholds for fast divergence estimate. This improves | ||
the alignment around low-complexity regions. | ||
|
||
* Tuned the default parameters to add highly divergent events only. | ||
|
||
* Warn about duplicated sequence names in graph construction (#3). | ||
|
||
This version generates graphs different from the previous versions. The mapping | ||
accuracy is improved due to the bug fixes and parameter tuning. | ||
|
||
(8 September 2019, r285) | ||
|
||
|
||
|
||
Release 0.4-r267 (22 August 2019) | ||
--------------------------------- | ||
|
||
Notable changes: | ||
|
||
* Support paired-end mapping for short reads. | ||
|
||
* Remap and calculate coverage (see the new --cov option in the manpage). | ||
|
||
* Fixed multiple edges in the generated graphs. The v0.3 14-genome graph | ||
contains one multiple edge. | ||
|
||
* Use dynamic minimizer occurrence cutoff. For human data, the dynamic cutoff | ||
is around 137, higher than the default cutoff 100 used in earlier versions. | ||
As a result, graph generations will become a little slower. | ||
|
||
Due to the last two changes, graphs generated with this version are different | ||
from the previous versions. | ||
|
||
(22 August 2019, r267) | ||
|
||
|
||
|
||
Release 0.3-r243 (7 August 2019) | ||
-------------------------------- | ||
|
||
This release generates graphs with SR tags on L-lines. The topology of the | ||
graph is identical to the one generated with v0.2. | ||
|
||
(7 August 2019, r243) | ||
|
||
|
||
|
||
Release 0.2-r235 (19 July 2019) | ||
------------------------------- | ||
|
||
This release fixes multiple minor bugs. It also considers k-mer matches and | ||
improves the accuracy of graph chaining. Nonetheless, the old chaining | ||
algorithm, albeit simple, works quite well. The improvement is marginal. | ||
|
||
(19 July 2019, r235) | ||
|
||
|
||
|
||
Release 0.1-r191 (6 July 2019) | ||
------------------------------ | ||
|
||
Initial proof-of-concept release. | ||
|
||
(6 July 2019, r191) |
Oops, something went wrong.