Polish code with flake8 output & add functional test 5 and 6 #63

XingerTang · 2023-08-28T16:01:54Z

Polish code according to AlphaGenes/tinyhouse#5
Add functional tests 5 and 6 #57 #58

I fix the test 6 by adding a result-checking part.
For test 5, originally test 5 did not have any data, so I copied the data of test 1 to test 5, and I set the expected data to check against the output of the command of the input data. Thus, I don't know if the code behaved as expected, maybe we need to assess the validity of the data later.

gregorgorjanc · 2023-08-29T07:08:38Z

src/tinypeel/Peeling/PeelingUpdates.py

@@ -241,64 +235,64 @@ def updateSeqError_ind(counts, errors, refReads, altReads, genoProbs):
 #


-def updateSeg(peelingInfo):


@XingerTang these edits to getNewton() and related functions in this PeelingUpdates.py file should not be in this PR - best to keep each PR simple and doing mostly one thing - polishing flake8 and tests is fine, but you are doing something else here,

@gregorgorjanc I comment this part because this whole part of the code is broken (more details about it in the report I wrote), and updateSeg() is trying to call a function setupTransmission() which is not defined so the flake8 complained about this. So, it is flake8 related but I did do some extra here, I can uncomment the functions other than setupTransmission() if it is better.

@gregorgorjanc I comment this part because this whole part of the code is broken (more details about it in the report I wrote), and updateSeg() is trying to call a function setupTransmission() which is not defined so the flake8 complained about this. So, it is flake8 related but I did do some extra here, I can uncomment the functions other than setupTransmission() if it is better.

I see! Hmm, odd, BUT good find!

Please open an issue so we have a history of such changes in the repository with a clear statement of what we should do and then PRs solving the issue are linked to the issues through "hash-links".

Then let's solve this as part of that other issue. Best to work one step at a time;)

gregorgorjanc · 2023-08-29T07:19:18Z

I fix the test 6 by adding a result-checking part.

Test 6 is for the sex-chromosome functionality (#58). Do we have sex-chromosome genotype or sequence data - I don't see these files in this PR (are they already in the git repo from before)?

For test 5, originally test 5 did not have any data, so I copied the data of test 1 to test 5, and I set the expected data to check against the output of the command of the input data. Thus, I don't know if the code behaved as expected, maybe we need to assess the validity of the data later.

Yeah, for test 5 we should look at the error-rate functionality (#57), so we need to discuss how to do a reasonable testing case! Do you have any suggestion how to do this in a simple way? Perhaps adding some errors into the input data compared to the true data and test that AlphaPeel recovers the true state?

XingerTang · 2023-08-29T08:28:41Z

I fix the test 6 by adding a result-checking part.

Test 6 is for the sex-chromosome functionality (#58). Do we have sex-chromosome genotype or sequence data - I don't see these files in this PR (are they already in the git repo from before)?

For test 5, originally test 5 did not have any data, so I copied the data of test 1 to test 5, and I set the expected data to check against the output of the command of the input data. Thus, I don't know if the code behaved as expected, maybe we need to assess the validity of the data later.

Yeah, for test 5 we should look at the error-rate functionality (#57), so we need to discuss how to do a reasonable testing case! Do you have any suggestion how to do this in a simple way? Perhaps adding some errors into the input data compared to the true data and test that AlphaPeel recovers the true state?

@gregorgorjanc Fort test 6, the sex-related part is from the command and the input file pedigree.txt:

AlphaPeel -genotypes test6/genotypes.txt \
                          -seqfile test6/seqfile.txt \
                          -pedigree test6/pedigree.txt \
                          -runType multi \
                          -calling_threshold .1 \
                          -sexchrom \
                          -out test6/outputs/output

where -sexchrom indicates that this is a sex chromosome and

genotypes_0 0 0 M
genotypes_2 0 0 F
genotypes_M genotypes_0 genotypes_2 M 
genotypes_F genotypes_0 genotypes_2 F
sequence_0 0 0 M
sequence_2 0 0 F
sequence_M0 sequence_0 sequence_2 M 
sequence_F0 sequence_0 sequence_2 F
sequence_M9 sequence_0 sequence_2 M 
sequence_F9 sequence_0 sequence_2 F

gives the sex information.

XingerTang · 2023-08-29T08:58:10Z

@gregorgorjanc For test 5, if we add some errors to the input data and compare the output to the true state, I'm wondering if the corrupted data can be recovered 100% to the true state, if not then how can we know which part of the loss of the accuracy is not error rate related and which part is caused by the input error rate. If we cannot separate these two cases then we won't know whether the effect of the input error rate is as expected. And for those that can return to the true state 100%, we can't tell if the accuracy is just a coincidence or a result of the correct error rate.
So, I prefer to prepare simple input data and compute the expected outcome by hand to see whether the computed and expected outcomes match.

gregorgorjanc · 2023-08-29T13:33:15Z

Fort test 6, the sex-related part is from the command and the input file pedigree.txt:

AlphaPeel -genotypes test6/genotypes.txt \
                          -seqfile test6/seqfile.txt \
                          -pedigree test6/pedigree.txt \
                          -runType multi \
                          -calling_threshold .1 \
                          -sexchrom \
                          -out test6/outputs/output

where -sexchrom indicates that this is a sex chromosome and

genotypes_0 0 0 M
genotypes_2 0 0 F
genotypes_M genotypes_0 genotypes_2 M 
genotypes_F genotypes_0 genotypes_2 F
sequence_0 0 0 M
sequence_2 0 0 F
sequence_M0 sequence_0 sequence_2 M 
sequence_F0 sequence_0 sequence_2 F
sequence_M9 sequence_0 sequence_2 M 
sequence_F9 sequence_0 sequence_2 F

gives the sex information.

@XingerTang I understand, BUT have you looked if the input and output make sense in the context of sex chromosome - females have two copies of X chromosome so they can have allele dosage of 0, 1, or 2, while males have one copy of X, so they can have allele dosage of 0 or 1. At the moment, we ignore the PAR (pseudo-autosomal region) of the Y chromosome, where males can have allele dosage of 0, 1, or 2 (will open an issue for this for future work - not a priority now).

gregorgorjanc · 2023-08-29T13:49:58Z

@gregorgorjanc For test 5, if we add some errors to the input data and compare the output to the true state, I'm wondering if the corrupted data can be recovered 100% to the true state, if not then how can we know which part of the loss of the accuracy is not error rate related and which part is caused by the input error rate. If we cannot separate these two cases then we won't know whether the effect of the input error rate is as expected. And for those that can return to the true state 100%, we can't tell if the accuracy is just a coincidence or a result of the correct error rate.
So, I prefer to prepare simple input data and compute the expected outcome by hand to see whether the computed and expected outcomes match.

I moved this discussion to #57

XingerTang · 2023-08-29T14:06:40Z

Fort test 6, the sex-related part is from the command and the input file pedigree.txt:
AlphaPeel -genotypes test6/genotypes.txt \
                          -seqfile test6/seqfile.txt \
                          -pedigree test6/pedigree.txt \
                          -runType multi \
                          -calling_threshold .1 \
                          -sexchrom \
                          -out test6/outputs/output
where -sexchrom indicates that this is a sex chromosome and
genotypes_0 0 0 M
genotypes_2 0 0 F
genotypes_M genotypes_0 genotypes_2 M 
genotypes_F genotypes_0 genotypes_2 F
sequence_0 0 0 M
sequence_2 0 0 F
sequence_M0 sequence_0 sequence_2 M 
sequence_F0 sequence_0 sequence_2 F
sequence_M9 sequence_0 sequence_2 M 
sequence_F9 sequence_0 sequence_2 F
gives the sex information.
@XingerTang I understand, BUT have you looked if the input and output make sense in the context of sex chromosome - females have two copies of X chromosome so they can have allele dosage of 0, 1, or 2, while males have one copy of X, so they can have allele dosage of 0 or 1. At the moment, we ignore the PAR (pseudo-autosomal region) of the Y chromosome, where males can have allele dosage of 0, 1, or 2 (will open an issue for this for future work - not a priority now).

I checked the input genotypes and the sequence file, and almost every entry is filled with 0 or 9 except the genotypes_2, which has 2 on every locus, but the individual is a female, so the input data is fine. However, the output data (both the output generated by AlphaPeel and the expected output written a long time ago) contains male individuals with an allele dosage of 2. The AlphaPeel documentation marked the -sexchrom as an experimental option, so it might be still under development and requires further refinement.

gregorgorjanc · 2023-08-29T14:21:53Z

I checked the input genotypes and the sequence file, and almost every entry is filled with 0 or 9 except the genotypes_2, which has 2 on every locus, but the individual is a female, so the input data is fine. However, the output data (both the output generated by AlphaPeel and the expected output written a long time ago) contains male individuals with an allele dosage of 2. The AlphaPeel documentation marked the -sexchrom as an experimental option, so it might be still under development and requires further refinement.

@XingerTang ok, I propose to get this PR in and we open a new issue to address the sexchrom output, but I think this also raises the fact that we don't have sufficient information about these tests to really trust them.

gregorgorjanc · 2023-08-29T14:31:23Z

@XingerTang resolve the stuff mentioned in #63 (comment) and I will merge this PR.

gregorgorjanc · 2023-08-29T16:24:14Z

Merging this in to move to the next issues, some discovered as part of this PR work.

XingerTang and others added 4 commits August 28, 2023 15:42

Clean up code with flake8 report

ed30847

Add func test 6

34c20e2

Add func test 5

af76cca

Merge branch 'AlphaGenes:devel' into devel

7436251

XingerTang changed the title ~~Polish code with black8 output & add functional test 5 and 6~~ Polish code with flack8 output & add functional test 5 and 6 Aug 28, 2023

XingerTang changed the title ~~Polish code with flack8 output & add functional test 5 and 6~~ Polish code with flake8 output & add functional test 5 and 6 Aug 28, 2023

gregorgorjanc reviewed Aug 29, 2023

View reviewed changes

gregorgorjanc mentioned this pull request Aug 29, 2023

Add/fix test5 for error rate functionality #57

Open

gregorgorjanc mentioned this pull request Aug 29, 2023

Transmission rate update broken #65

Open

gregorgorjanc mentioned this pull request Aug 29, 2023

Revise sex chromosome functionality #66

Open

uncomment transmission rate related functions

d93a3f8

gregorgorjanc merged commit 466e303 into AlphaGenes:devel Aug 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Polish code with flake8 output & add functional test 5 and 6 #63

Polish code with flake8 output & add functional test 5 and 6 #63

XingerTang commented Aug 28, 2023

gregorgorjanc Aug 29, 2023

XingerTang Aug 29, 2023

gregorgorjanc Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

XingerTang commented Aug 29, 2023

XingerTang commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

XingerTang commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

		@@ -241,64 +235,64 @@ def updateSeqError_ind(counts, errors, refReads, altReads, genoProbs):
		#


		def updateSeg(peelingInfo):

Polish code with flake8 output & add functional test 5 and 6 #63

Polish code with flake8 output & add functional test 5 and 6 #63

Conversation

XingerTang commented Aug 28, 2023

gregorgorjanc Aug 29, 2023

Choose a reason for hiding this comment

XingerTang Aug 29, 2023

Choose a reason for hiding this comment

gregorgorjanc Aug 29, 2023

Choose a reason for hiding this comment

gregorgorjanc commented Aug 29, 2023

XingerTang commented Aug 29, 2023

XingerTang commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

XingerTang commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023

gregorgorjanc commented Aug 29, 2023