~ 35.63% faster filtering #3330

Draft
wants to merge 118 commits into
base: main
25e7cd4
updated for filter_cell
kaushalprasadhial Oct 25, 2024
fd6dd14
print statment removal
kaushalprasadhial Nov 15, 2024
a54e541
[pre-commit.ci] pre-commit autoupdate (#3119)
pre-commit-ci[bot] Jul 1, 2024
89aff86
(chore): add preparation-of-release documentation (#3122)
ilan-gold Jul 2, 2024
3bb4a73
[pre-commit.ci] pre-commit autoupdate (#3131)
pre-commit-ci[bot] Jul 2, 2024
b5337ea
Unpin numpy 2 (#3115)
flying-sheep Jul 2, 2024
ee1e3d9
Bugfix: Gene score edge case where gene_list gene is chosen as contro…
mumichae Jul 4, 2024
48f75b2
Use version guards instead of “except ImportError” (#3145)
flying-sheep Jul 8, 2024
3b6983d
[pre-commit.ci] pre-commit autoupdate (#3148)
pre-commit-ci[bot] Jul 11, 2024
2a1f001
fix layer use_raw (#3150)
Intron7 Jul 12, 2024
57fd1d7
Revert "fix layer use_raw (#3150)" (#3154)
Intron7 Jul 12, 2024
30350ec
[pre-commit.ci] pre-commit autoupdate (#3156)
pre-commit-ci[bot] Jul 23, 2024
1474a90
Allow all valid legend_loc options (#3163)
flying-sheep Jul 25, 2024
3aff03a
Fix tests for dask PCA (#3162)
flying-sheep Jul 25, 2024
b5ec2c5
Run benchmarks for off axis (#3147)
flying-sheep Jul 25, 2024
af38813
Fix `layers` parameter in `score_genes` with `.raw` (#3155)
Intron7 Jul 25, 2024
cf7c2b8
Refactor score_genes (#3170)
flying-sheep Jul 26, 2024
2deca56
Switch from using rubric in release notes (#3172)
flying-sheep Jul 26, 2024
df0ac25
[pre-commit.ci] pre-commit autoupdate (#3174)
pre-commit-ci[bot] Jul 30, 2024
e96cde2
Fix NaN dispersion (#3176)
flying-sheep Jul 30, 2024
6a421f7
Cache data for subsequent test runs (#3177)
flying-sheep Jul 30, 2024
4861996
Some refactoring ahead of key_added (#3182)
flying-sheep Aug 1, 2024
352061e
Auto-metric and fix logs (#3186)
flying-sheep Aug 1, 2024
f4a8dc0
Add Ruff FBT (#3189)
flying-sheep Aug 1, 2024
48d9566
Add Ruff pytest-style (#3191)
flying-sheep Aug 1, 2024
5afa91e
Add key_added to umap, tsne, and pca (#3184)
flying-sheep Aug 2, 2024
f7c9c97
Fix [source] links (#3194)
flying-sheep Aug 5, 2024
7b133d4
[pre-commit.ci] pre-commit autoupdate (#3197)
pre-commit-ci[bot] Aug 6, 2024
ffd3a60
(fix): resolve data ordering to match axis for stacked violin plots (…
ilan-gold Aug 6, 2024
06d1bb3
[pre-commit.ci] pre-commit autoupdate (#3207)
pre-commit-ci[bot] Aug 13, 2024
ad496e9
[pre-commit.ci] pre-commit autoupdate (#3210)
pre-commit-ci[bot] Aug 22, 2024
87b6d13
(fix): Upper bound `dask` (#3217)
ilan-gold Aug 30, 2024
38cc893
Update notebooks (#3216)
flying-sheep Sep 2, 2024
567b1e2
[pre-commit.ci] pre-commit autoupdate (#3213)
pre-commit-ci[bot] Sep 2, 2024
2f6720b
[pre-commit.ci] pre-commit autoupdate (#3225)
pre-commit-ci[bot] Sep 13, 2024
ab16928
fa2 library changed to fa2_modified (#3220)
AminAlam Sep 13, 2024
0d65925
[pre-commit.ci] pre-commit autoupdate (#3232)
pre-commit-ci[bot] Sep 17, 2024
c62e4cb
Switch to towncrier (#3231)
flying-sheep Sep 17, 2024
531703b
Fix towncrier git CLI call (#3236)
flying-sheep Sep 17, 2024
3ba6c0c
Backport PR #3235 on branch main ((chore): generate 1.10.3 release no…
meeseeksmachine Sep 17, 2024
c9685a5
Fix release note building and check (#3239)
flying-sheep Sep 17, 2024
0bd71d0
impl median function for aggregation (#3180)
farhadmd7 Sep 17, 2024
7fef08f
Upload scrublet scores on test failure (#3069)
ilan-gold Sep 19, 2024
ab6547f
Fix stacked_violin’s `standard_scale` parameter (#3243)
flying-sheep Sep 20, 2024
06a7cd1
Finish `scale`→`density_norm` deprecation (#3244)
flying-sheep Sep 20, 2024
7552cd0
Rely on Ruff for TYPE_CHECKING block mgmt (#3248)
flying-sheep Sep 20, 2024
36ec3a6
Clean up dendrogram typing (#3249)
flying-sheep Sep 20, 2024
d77a755
Deprecate defunct `order` parameter in `stacked_violin` (#3252)
flying-sheep Sep 20, 2024
b4fc78e
Fix *Plot.style() methods (#3206)
flying-sheep Sep 20, 2024
4a61039
[pre-commit.ci] pre-commit autoupdate (#3256)
pre-commit-ci[bot] Sep 24, 2024
876828a
Add `SIM` checks (#3258)
flying-sheep Sep 24, 2024
fb2d8d5
Fix compat typing and old_positionals usage (#3264)
flying-sheep Sep 26, 2024
5dd2d09
Split up PCA tests (#3268)
flying-sheep Sep 30, 2024
b54822f
[pre-commit.ci] pre-commit autoupdate (#3270)
pre-commit-ci[bot] Oct 1, 2024
5bdf305
Remove 3.9 support (#3283)
flying-sheep Oct 15, 2024
66ee0c9
Fix #3206’s release note (#3287)
flying-sheep Oct 15, 2024
b48bb97
(fix): conditional imports to avoid `anndata.io` warning (#3289)
ilan-gold Oct 17, 2024
861c81b
Fix benchmark job: Use upstream asv (#3292)
flying-sheep Oct 17, 2024
fead3b5
Use upstream sklearn PCA if possible (#3267)
flying-sheep Oct 18, 2024
3020094
Test all PCA param combinations (#3294)
flying-sheep Oct 18, 2024
c2f9407
Add explicit support to PCA for `'covariance_eigh'` svd_solver (#3296)
flying-sheep Oct 18, 2024
1448822
Implement sparse `covariance_eigh` PCA using Dask (#3263)
flying-sheep Oct 18, 2024
3920154
Fix HVG with 1-obs batches (#3286)
flying-sheep Oct 21, 2024
f804367
Allow specifying a collection of colors to scatterplots (#3299)
flying-sheep Oct 21, 2024
3b68bab
(fix): correct anndata release for `io` usage (#3298)
ilan-gold Oct 22, 2024
c2a615b
[pre-commit.ci] pre-commit autoupdate (#3274)
pre-commit-ci[bot] Oct 22, 2024
f860829
(fix): clarify sparse pca usage (#3306)
ilan-gold Oct 22, 2024
a38bf01
Fix sc.pl.highest_expr_genes with a categorical column (#3302)
flying-sheep Oct 22, 2024
5d9505d
Fix some `Returns` docstrs re: `inplace` semantics (#3311)
ryan-williams Oct 23, 2024
6990065
Catch PerfectSeparationWarning during regress_out (#3275)
jeskowagner Oct 24, 2024
9658e07
Refactor regress_out (#3316)
flying-sheep Oct 25, 2024
ced5008
Enforce `np.bool_` usage via Ruff (#3321)
flying-sheep Oct 25, 2024
233fed1
Update `test_rank_genes_groups.py` reference (#3285)
emmanuel-ferdman Oct 28, 2024
44bd465
Support `layer` in `sc.pl.highest_expr_genes` (#3324)
flying-sheep Oct 31, 2024
5a02a07
Align `get.obs_df`’s docs with its code (#3328)
flying-sheep Oct 31, 2024
9d66874
[pre-commit.ci] pre-commit autoupdate (#3329)
pre-commit-ci[bot] Nov 5, 2024
ccfe2c9
(fix): sort pca test args (#3333)
ilan-gold Nov 5, 2024
d244371
Add PYI lints (#3339)
flying-sheep Nov 5, 2024
f3e5d4e
(feat): `calculate_qc_metrics` with `dask` (#3307)
ilan-gold Nov 7, 2024
cad568a
Fix docs (#3343)
flying-sheep Nov 7, 2024
ee134c5
move all `njit` calls into a decorator (#3335)
flying-sheep Nov 8, 2024
fb68987
Update notebooks (#3349)
flying-sheep Nov 11, 2024
dea952e
Speedup (~20x) of scanpy.pp.regress_out function using Linear Least S…
kaushalprasadhial Nov 11, 2024
83991eb
Fix zappy compatibility for clip_array (#3317)
flying-sheep Nov 11, 2024
1b7d8f8
[pre-commit.ci] pre-commit autoupdate (#3354)
pre-commit-ci[bot] Nov 12, 2024
c6a5e58
Backport PR #3357 on branch main ((chore): generate 1.10.4 release no…
meeseeksmachine Nov 12, 2024
6705928
Actually working min-deps job (#3337)
flying-sheep Nov 14, 2024
88d0564
Fix CI (#3364)
flying-sheep Nov 14, 2024
5d7efdb
[pre-commit.ci] pre-commit autoupdate (#3373)
pre-commit-ci[bot] Nov 19, 2024
ba7dea8
Deprecate RandomState (using names only) (#3372)
flying-sheep Nov 19, 2024
28e2b01
Updated Harmony Integrate Docs to better match interface to Harmonypy…
DaminK Nov 19, 2024
1d71eb1
Use deprecation decorator (#3380)
flying-sheep Nov 22, 2024
f74263c
(fix): bound sklearn because of dask-ml on the release candidate (#3393)
ilan-gold Dec 6, 2024
52e4ee5
[pre-commit.ci] pre-commit autoupdate (#3388)
pre-commit-ci[bot] Dec 10, 2024
ad09329
Remove calls to `.format` (#3325)
flying-sheep Dec 10, 2024
817d972
Constrain all extras for min-deps job (#3367)
flying-sheep Dec 12, 2024
98e241c
Add a “improved documentation” category to enhancement template (#3403)
flying-sheep Dec 16, 2024
b02b1ce
Modify error message if certifi is not installed (#3402)
flying-sheep Dec 17, 2024
120db93
Add replace option to subsample and rename function to sample (#943)
gokceneraslan Dec 19, 2024
a99d365
Switch to session-info2 (#3384)
flying-sheep Dec 19, 2024
6e70c3d
Scipy 1.15 compat, some test refactors (#3409)
flying-sheep Dec 19, 2024
99b9aa0
Deprecate visium (#3407)
flying-sheep Dec 19, 2024
e8cd544
Add sample probabilities (#3410)
flying-sheep Dec 20, 2024
a578d83
(chore): generate 1.11.0 release notes (#3412)
flying-sheep Dec 20, 2024
9f22db2
Note that it’s an rc
flying-sheep Dec 20, 2024
be9ae60
[pre-commit.ci] pre-commit autoupdate (#3404)
pre-commit-ci[bot] Jan 9, 2025
e4dc24b
Fix Markdown syntax for preprocessing docs page (#3418)
dinakazemi Jan 9, 2025
49b5439
Doc fixes for 1.11 (#3415)
flying-sheep Jan 9, 2025
0d750f4
Mention towncrier in contribution docs (#3427)
flying-sheep Jan 9, 2025
765661f
Fix wilcoxon for >10M cells (#3426)
flying-sheep Jan 9, 2025
ca3dd8f
Update author/maintainer metadata (#3413)
flying-sheep Jan 9, 2025
f9894b7
Formatting (#3414)
flying-sheep Jan 10, 2025
723a246
(chore) upper bound zarr version in tests to <3 (#3432)
flying-sheep Jan 10, 2025
30f83d8
Fix flaky doublet test (#3436)
flying-sheep Jan 13, 2025
1c0afee
(chore): Update to Ruff 0.9 and add EM lints (#3437)
flying-sheep Jan 13, 2025
fbb0692
Grammar fixes in `sc.tl` docstrings (#3438)
zm711 Jan 15, 2025
84a2c39
updated for filter_cell
kaushalprasadhial Oct 25, 2024
4901542
print statment removal
kaushalprasadhial Nov 15, 2024
12 changes: 9 additions & 3 deletions .azure-pipelines.yml
@@ -15,16 +15,16 @@ jobs:
 vmImage: 'ubuntu-22.04'
 strategy:
 matrix:
-Python3.9:
-python.version: '3.9'
+Python3.10:
+python.version: '3.10'
 Python3.12: {}
 minimal_dependencies:
 TEST_EXTRA: 'test-min'
 anndata_dev:
 DEPENDENCIES_VERSION: "pre-release"
 TEST_TYPE: "coverage"
 minimum_versions:
-python.version: '3.9'
+python.version: '3.10'
 DEPENDENCIES_VERSION: "minimum-version"
 TEST_TYPE: "coverage"

@@ -103,6 +103,12 @@ jobs:
 testResultsFormat: NUnit
 testRunTitle: 'Publish test results for $(Agent.JobName)'

+- task: PublishBuildArtifacts@1
+inputs:
+pathToPublish: '.pytest_cache/d/debug'
+artifactName: debug-data
+condition: eq(variables['TEST_TYPE'], 'coverage')
+
 - script: bash <(curl -s https://codecov.io/bash)
 displayName: 'Upload to codecov.io'
 condition: eq(variables['TEST_TYPE'], 'coverage')
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug-report.yml
@@ -1,8 +1,8 @@
 name: Bug report
 description: Scanpy doesn’t do what it should? Please help us fix it!
 #title: ...
+type: Bug
 labels:
-- Bug 🐛
 - Triage 🩺
 #assignees: []
 body:
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/config.yml
@@ -1,4 +1,4 @@
-blank_issues_enabled: true
+blank_issues_enabled: false
 contact_links:
 - name: Scanpy Community Forum
 url: https://discourse.scverse.org/
3 changes: 2 additions & 1 deletion .github/ISSUE_TEMPLATE/enhancement-request.yml
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
name: Enhancement request
description: Anything you’d like to see in scanpy?
#title: ...
type: Enhancement
labels:
- Enhancement ✨
- Triage 🩺
#assignees: []
body:
Expand All @@ -14,6 +14,7 @@ body:
- 'Additional function parameters / changed functionality / changed defaults?'
- 'New analysis tool: A simple analysis tool you have been using and are missing in `sc.tools`?'
- 'New plotting function: A kind of plot you would like to seein `sc.pl`?'
- 'Improved documentation or error message?'
- 'Other?'
validations:
required: true
Expand Down
3 changes: 1 addition & 2 deletions .github/workflows/benchmark.yml
@@ -48,8 +48,7 @@ jobs:
 key: benchmark-state-${{ hashFiles('benchmarks/**') }}

 - name: Install dependencies
-# TODO: revert once this PR is merged: https://github.com/airspeed-velocity/asv/pull/1397
-run: pip install 'asv @ git+https://github.com/ivirshup/asv@fix-conda-usage'
+run: pip install 'asv>=0.6.4'

 - name: Configure ASV
 working-directory: ${{ env.ASV_DIR }}
8 changes: 4 additions & 4 deletions .github/workflows/check-pr.yml
@@ -49,13 +49,13 @@ jobs:
 with:
 fetch-depth: 0
 filter: blob:none
-- name: Find out if relevant release notes are modified
-uses: dorny/paths-filter@v2
+- name: Find out if a relevant release fragment is added
+uses: dorny/paths-filter@v3
 id: changes
 with:
 filters: | # this is intentionally a string
-relnotes: 'docs/release-notes/${{ github.event.pull_request.milestone.title }}.md'
-- name: Check if relevant release notes are modified
+relnotes: 'docs/release-notes/${{ github.event.pull_request.number }}.*.md'
+- name: Check if a relevant release fragment is added
 uses: flying-sheep/check@v1
 with:
 success: ${{ steps.changes.outputs.relnotes }}
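For illustration, the new filter can be read as: the check passes when the PR touches at least one file named `docs/release-notes/<PR number>.<anything>.md`. The sketch below approximates that rule in Python. The helper name `has_release_fragment`, the example paths, and the fragment type `bugfix` are hypothetical, and `fnmatchcase` is only a stand-in for the glob matcher the action uses, so edge cases (for example how `*` treats `/`) may differ.

```python
from fnmatch import fnmatchcase


def has_release_fragment(changed_files, pr_number):
    """Return True if any changed path looks like a release fragment for this PR."""
    # Mirrors the filter 'docs/release-notes/<number>.*.md' from the workflow above.
    pattern = f"docs/release-notes/{pr_number}.*.md"
    return any(fnmatchcase(path, pattern) for path in changed_files)


# Hypothetical list of files touched by PR 3330.
changed = ["src/scanpy/preprocessing/_simple.py", "docs/release-notes/3330.bugfix.md"]
print(has_release_fragment(changed, 3330))  # True

# A versioned release-notes file alone would not satisfy the new check.
print(has_release_fragment(["docs/release-notes/1.11.0.md"], 3330))  # False
```

Keying the filter on the PR number rather than the milestone title means the check no longer depends on a milestone being set.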
3 changes: 1 addition & 2 deletions .gitignore
@@ -18,7 +18,6 @@
 /tests/**/*failed-diff.png

 # Environment management
-/hatch.toml
 /Pipfile
 /Pipfile.lock
 /requirements*.lock
@@ -29,6 +28,7 @@
 # Python build files
 __pycache__/
 /src/scanpy/_version.py
+/ci/scanpy-min-deps.txt
 /dist/
 /*-env/
 /env-*/
@@ -42,7 +42,6 @@ Thumbs.db

 # IDEs and editors
 /.idea/
-/.vscode/

 # asv benchmark files
 /benchmarks/.asv
19 changes: 11 additions & 8 deletions .pre-commit-config.yaml
@@ -1,16 +1,13 @@
 repos:
 - repo: https://github.com/astral-sh/ruff-pre-commit
-rev: v0.4.9
+rev: v0.9.1
 hooks:
 - id: ruff
-types_or: [python, pyi, jupyter]
 args: ["--fix"]
 - id: ruff-format
-types_or: [python, pyi, jupyter]
 # The following can be removed once PLR0917 is out of preview
 - name: ruff preview rules
 id: ruff
-types_or: [python, pyi, jupyter]
 args: ["--preview", "--select=PLR0917"]
 - repo: https://github.com/flying-sheep/bibfmt
 rev: v4.3.0
@@ -19,8 +16,17 @@ repos:
 args:
 - --sort-by-bibkey
 - --drop=abstract
+- repo: https://github.com/biomejs/pre-commit
+rev: v0.6.1
+hooks:
+- id: biome-format
+additional_dependencies: ["@biomejs/[email protected]"]
+- repo: https://github.com/ComPWA/taplo-pre-commit
+rev: v0.9.3
+hooks:
+- id: taplo-format
 - repo: https://github.com/pre-commit/pre-commit-hooks
-rev: v4.6.0
+rev: v5.0.0
 hooks:
 - id: trailing-whitespace
 exclude: tests/_data
@@ -34,6 +40,3 @@ repos:
 - id: detect-private-key
 - id: no-commit-to-branch
 args: ["--branch=main"]
-
-ci:
-autofix_prs: false
10 changes: 9 additions & 1 deletion .readthedocs.yml
@@ -2,9 +2,16 @@ version: 2
 submodules:
 include: all
 build:
-os: ubuntu-20.04
+os: ubuntu-24.04
 tools:
 python: '3.12'
+jobs:
+post_checkout:
+# unshallow so version can be derived from tag
+- git fetch --unshallow || true
+pre_build:
+# run towncrier to preview the next version’s release notes
+- ( find docs/release-notes -regex '[^.]+[.][^.]+.md' | grep -q . ) && towncrier build --keep || true
 sphinx:
 fail_on_warning: true # do not change or you will be fired
 configuration: docs/conf.py
@@ -14,4 +21,5 @@ python:
 path: .
 extra_requirements:
 - doc
+- dev # for towncrier
 - leiden
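The `pre_build` step above only runs `towncrier build --keep` when `find` reports at least one path matching `[^.]+[.][^.]+.md`, which appears intended to pick out fragments of the form `<name>.<type>.md` while skipping versioned files such as `1.10.3.md`. The following is a rough Python approximation of that test, assuming `find -regex` matches the whole path as `re.fullmatch` does. The helper name `is_fragment_path` and the example paths are hypothetical.

```python
import re

# Same expression as in the `find -regex` call; note the last `.` is unescaped
# and therefore matches any single character before `md`.
FRAGMENT_RE = re.compile(r"[^.]+[.][^.]+.md")


def is_fragment_path(path):
    """Return True if the whole path matches the fragment pattern."""
    return FRAGMENT_RE.fullmatch(path) is not None


for path in [
    "docs/release-notes/3330.bugfix.md",  # two dots: treated as a fragment
    "docs/release-notes/1.10.3.md",       # three dots: versioned notes, skipped
    "docs/release-notes/index.md",        # one dot: skipped
]:
    print(path, is_fragment_path(path))
```

Because `[^.]+` cannot cross a dot, the number of dots in the path is what separates fragments from other Markdown files in that directory.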
5 changes: 5 additions & 0 deletions .taplo.toml
@@ -0,0 +1,5 @@
[formatting]
array_auto_collapse = false
column_width = 120
compact_arrays = false
indent_string = ' '
26 changes: 26 additions & 0 deletions .vscode/launch.json
@@ -0,0 +1,26 @@
{
"version": "0.2.0",
"configurations": [
{
"name": "Python: Build Documentation",
"type": "debugpy",
"request": "launch",
"module": "sphinx",
"args": ["-M", "html", ".", "_build"],
"cwd": "${workspaceFolder}/docs",
"console": "internalConsole",
"justMyCode": false,
},
{
"name": "Python: Debug Test",
"type": "debugpy",
"request": "launch",
"program": "${file}",
"purpose": ["debug-test"],
"console": "internalConsole",
"justMyCode": false,
"env": { "PYTEST_ADDOPTS": "--color=yes" },
"presentation": { "hidden": true },
},
],
}
22 changes: 22 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,22 @@
{
"[python][toml][json][jsonc]": {
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.organizeImports": "explicit",
"source.fixAll": "explicit",
},
},
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
},
"[toml]": {
"editor.defaultFormatter": "tamasfe.even-better-toml",
},
"[json][jsonc]": {
"editor.defaultFormatter": "biomejs.biome",
},
"python.analysis.typeCheckingMode": "basic",
"python.testing.pytestArgs": ["-vv", "--color=yes"],
"python.testing.pytestEnabled": true,
"python.terminal.activateEnvironment": true,
}
11 changes: 11 additions & 0 deletions benchmarks/README.md
@@ -8,3 +8,14 @@ Benchmarks are run using the [benchmark bot][].
 [asv]: https://asv.readthedocs.io/
 [`benchmark.yml`]: ../.github/workflows/benchmark.yml
 [benchmark bot]: https://github.com/apps/scverse-benchmark
+
+## Data processing in benchmarks
+
+Each dataset is processed so it has
+
+- `.layers['counts']` (containing data in C/row-major format) and `.layers['counts-off-axis']` (containing data in FORTRAN/column-major format)
+- `.X` and `.layers['off-axis']` with log-transformed data (formats like above)
+- a `.var['mt']` boolean column indicating mitochondrial genes
+
+The benchmarks are set up so the `layer` parameter indicates the layer that will be moved into `.X` before the benchmark.
+That way, we don’t need to add `layer=layer` everywhere.
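The layout described in this README section can be sketched roughly as follows. This is not the benchmark suite's actual code: it uses dense NumPy arrays and a plain dict in place of an AnnData object, the function names `prepare_benchmark_arrays` and `select_layer` are hypothetical, and recognising mitochondrial genes by an `MT-` prefix is an assumption.

```python
import numpy as np


def prepare_benchmark_arrays(counts, var_names):
    """Build the arrays described above from a dense cells x genes count matrix."""
    counts_c = np.ascontiguousarray(counts, dtype=np.float32)  # C / row-major
    log_c = np.log1p(counts_c)  # log-transformed data, destined for .X
    layers = {
        "counts": counts_c,
        "counts-off-axis": np.asfortranarray(counts_c),  # column-major copy
        "off-axis": np.asfortranarray(log_c),
    }
    # Assumption: mitochondrial genes are recognised by an "MT-" name prefix.
    mt = np.array([name.startswith("MT-") for name in var_names])
    return log_c, layers, mt


def select_layer(x, layers, layer=None):
    """Return the array a benchmark would see in .X for a given `layer`."""
    return x if layer is None else layers[layer]


x, layers, mt = prepare_benchmark_arrays(
    [[0, 3, 1], [2, 0, 5]], ["MT-CO1", "ACTB", "GAPDH"]
)
print(layers["counts"].flags["C_CONTIGUOUS"])           # True
print(layers["counts-off-axis"].flags["F_CONTIGUOUS"])  # True
print(mt.tolist())                                      # [True, False, False]
```

Swapping the chosen layer into `.X` once, before timing starts, keeps the benchmarked call itself identical across memory layouts.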
5 changes: 3 additions & 2 deletions benchmarks/asv.conf.json
@@ -54,7 +54,7 @@

 // The Pythons you'd like to test against. If not provided, defaults
 // to the current version of Python used to run `asv`.
-// "pythons": ["3.9", "3.12"],
+// "pythons": ["3.10", "3.12"],

 // The list of conda channel names to be searched for benchmark
 // dependency packages in the specified order
@@ -78,13 +78,14 @@
 "natsort": [""],
 "pandas": [""],
 "memory_profiler": [""],
-"zarr": [""],
+"zarr": ["2.18.4"],
 "pytest": [""],
 "scanpy": [""],
 "python-igraph": [""],
 // "psutil": [""]
 "pooch": [""],
 "scikit-image": [""],
+// "scikit-misc": [""],
 },

 // Combinations of libraries/python versions can be excluded/included