You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
One can set the CSV separator (delimiter) and a regular expression to match lines (line_filter).
My CSV is separated with ";" so I set delimiter accordingly.
In order to match against the content of a specific column I constructed an regex like: ^[^;]*;[^;]*;[^;]*;[^;]*;[^;]*donation#i
to match "donation" in column 5.
([^;]*; matches non-semicolon characters followed by a semicolon, so "donation" is searched in column 6)
Checking directly in the CSV the reg ex found the desired lines.
But the import did not work.
Digging around I noticed that CRM_Banking_PluginImpl_Importer_CSV::import_file() ( CRM/Banking/PluginImpl/Importer/CSV.php ) splits each row into an array $line using $config->delimiter.
If a line_filter is set:
array $line is joined again using hard wired ',': $full_line = trim(implode(',', $line));
and this $full_line is matched against the reg ex
I think we should change this to use the given delimiter:
--- ./org.project60.banking/CRM/Banking/PluginImpl/Importer/CSV.php.ORG 2024-12-10 16:34:33.831015379 +0100+++ ./org.project60.banking/CRM/Banking/PluginImpl/Importer/CSV.php 2025-01-24 00:37:12.241468997 +0100@@ -163,7 +163,7 @@
// check if we want to skip line (by filter)
if (!empty($config->line_filter)) {
- $full_line = trim(implode(',', $line));+ $full_line = trim(implode($config->delimiter, $line));
if (!preg_match($config->line_filter, $full_line)) {
$config->header += 1; // bump line numbers if filtered out
continue;
I also tried to change my reg ex using commas instead of semicolons.
Since some of my data columns contain commas I got some unwanted matches - so I think using the configured delimiter is the best option.
The only problem I see with this proposal is that it could change the behavior of existing import configs (in case somebody worked around it and used commas in their line_filter).
Therefor I propose to add an additional config setting use_comma_for_line_filter_delimiter defaulting to TRUE (maybe legacy_use_comma_for_line_filter_delimiter is a better name).
Any thoughts?
I could write an MR for this.
The text was updated successfully, but these errors were encountered:
gngn23
pushed a commit
to gngn23/org.project60.banking
that referenced
this issue
Jan 27, 2025
One can set the CSV separator (
delimiter
) and a regular expression to match lines (line_filter
).My CSV is separated with ";" so I set
delimiter
accordingly.In order to match against the content of a specific column I constructed an regex like:
^[^;]*;[^;]*;[^;]*;[^;]*;[^;]*donation#i
to match "donation" in column 5.
(
[^;]*;
matches non-semicolon characters followed by a semicolon, so "donation" is searched in column 6)Checking directly in the CSV the reg ex found the desired lines.
But the import did not work.
Digging around I noticed that
CRM_Banking_PluginImpl_Importer_CSV::import_file()
(CRM/Banking/PluginImpl/Importer/CSV.php
) splits each row into an array$line
using$config->delimiter
.If a
line_filter
is set:array
$line
is joined again using hard wired','
:$full_line = trim(implode(',', $line));
and this
$full_line
is matched against the reg exI think we should change this to use the given
delimiter
:I also tried to change my reg ex using commas instead of semicolons.
Since some of my data columns contain commas I got some unwanted matches - so I think using the configured delimiter is the best option.
The only problem I see with this proposal is that it could change the behavior of existing import configs (in case somebody worked around it and used commas in their
line_filter
).Therefor I propose to add an additional config setting
use_comma_for_line_filter_delimiter
defaulting to TRUE (maybelegacy_use_comma_for_line_filter_delimiter
is a better name).Any thoughts?
I could write an MR for this.
The text was updated successfully, but these errors were encountered: