Bug #11093
closedCSV file import doesn't handle fields with spaces or missing fields that are not applicable
100%
Description
I was given an Excel file to use to produce updates to all the Emergency Operations Centers Icom IC-2100H radios. Importing or opening the csv file from Excel resulted in many channels being blank. I stepped through possible reasons for this. "ERock EOC Icom_IC-2100H memories 20240118.zip" contains my test steps in a txt file, and several versions of the data, in Excel, csv, and img formats. chirp-next-20240111-win64.
Files
Updated by Dan Smith 10 months ago
I'm sure you saw from your debug log, but this is the problem:
ERROR: Line 4: [duplex] ` ' is not in valid list: ['', '+', '-', 'split', 'off']
(repeated for many other memories)
I guess the real bug here is that we should show that to the user. We could maybe do that on import, but we don't currently have a way for a driver (which CSV is, just like any other radio) to load an image (CSV file) and also throw some non-critical errors. Meaning, we can have the radio raise an error and say that the file is un-parse-able, but we can't say "I mostly loaded this for you, but some stuff was broken".
TBH, the actual intolerance of malformed data doesn't seem like a chirp problem to me, because the file (in your example) is explicitly asking for a duplex of " ", which is invalid. We wouldn't accept "X" or "n/a" or "simplex" there so I'm not really sure that being tolerant of a space as a special case makes sense. We can't really strip whitespace from each column because some values need to preserve leading and trailing whitespace (such as the memory name field or the D-STAR callsign fields).
What special behavior do you think we should add here? A special case for duplex=" "? Excel must not be putting the errant space characters, because it leaves other fields alone (such as skip, in your files). I assume this means you entered the space characters into those fields yourself?
Testing indicates that chirp import routine for this radio doesn't handle either empty fields
I assume you are thinking that a field like "offset" should be allowed to be left blank to imply 0MHz? Some might think that the band plan should be applied to take the actual "default".
I guess my problem with letting you leave any field blank means that some people may be confused about why they left a field like "Tone" blank, but chirp actually considers it to have a value (since it has no concept of nothing being in that field - it always has to have a value, even if it's unused, like many radios do).
So, I agree it would be best if we could expose the problems with the file so the user is aware (instead of just silently not loading those memories), but I'm not really convinced we should be substantially more tolerant with the data format. Maybe I'm missing why it's difficult to avoid breaking the format while editing the file outside of chirp?
Updated by Rich Messeder 10 months ago
Dan Smith wrote in #note-1:
I assume this means you entered the space characters into those fields yourself?
Thanks for getting back to me...I offer the following discussion, but without argument, rather as a way of exchanging POVs...
No. I was given the file, so I don't know what editing the author did. I had passed him a file with defaults filled in, and he passed it back...with other corrections...and defaults edited out. His reasoning was that, for example, there was no Tone option selected, there should be no tone value entered. (Aside: you might have seen in one of my earlier posts that some folks think that chirp inserts defaults for PL tone.) The casual spreadsheet user is likely not able to distinguish between a cell that is "blank" and one that is truly empty ("CLEARed", in Excel). But I only mention it, based on my experiences working with users and teaching spreadsheet use, not to address the issue here. The point is that I was handed a file that /looked/ OK to me, as a chirp novice (Yes, I have used it for years, but not studied it to become an expert. I simply wanted to use it for several reasons mentioned in the past, and that are similar to many other users.) I didn't know that chirp would choke on empty fields, and I didn't know that "apparently empty" fields actually had "spaces", and were not "empty".
I assume you are thinking that a field like "offset" should be allowed to be left blank to imply 0MHz? Some might think that the band plan should be applied to take the actual "default".
No. Or yes. I think that if the channel is a simplex channel, then no offset is applicable. Of course, chirp doesn't necessarily know that I want that channel to be simplex, but the lack of an offset is reasonable to me. To get the csv file to pass, I simply changed all the empty offset channels to 0.6, and got on with business. Because the channels are to be used as simplex, it could have been any value, and the end users are not likely to notice or care. But I don't know what the radios need, so I can see using 0.0, also. In fact, I might try this, and if it passes, update my notes.
I guess my problem with letting you leave any field blank means that some people may be confused about why they left a field like "Tone" blank, but chirp actually considers it to have a value (since it has no concept of nothing being in that field - it always has to have a value, even if it's unused, like many radios do).
You know chirp users far better than I.
Bingo. If chirp actually considers it to have a value...I guess on reading from the radio...then it seems reasonable to me that it can default on import in the same way. I also see the ham's POV who gave me the file. If no PL tone is needed, then why insert one? I havna checked IC-2100H radio channels to see if they default to a value if a memory is stored as simplex, and no PL tone is specified, or even if a channel /can/ be stored without specifying tone.
(Aside: I got the original file for the ham I am "working for", by reading the 2100, exporting as a csv, and saving as xls, then sending off.)
So, I agree it would be best if we could expose the problems with the file so the user is aware (instead of just silently not loading those memories), but I'm not really convinced we should be substantially more tolerant with the data format. Maybe I'm missing why it's difficult to avoid breaking the format while editing the file outside of chirp?
Because we (or at least I) don't know what naive users will do when editing a file. My experience is that folks who are familiar with a process or tool often think that it's all common sense, and anyone would, of course, know how to use it correctly. My first programming experience (as an inexperienced blue collar worker trying to help the secretary) was a disaster simply because I didn't imagine that she would not follow my directions. At some point she had good reason not to, and blew up the program. Lesson learned.
Well, I'd have been further along if folks on this list had responded to my direct questions about default values. Ditto, if I could have found something about this in the docs. I saw several dire warnings about using Excel, but they contained little actual fact, only mythology. I have been using Excel for chirp files for nearly a decade. No problems.
I have an understanding now, and have added my notes to the Excel file for future reference. Perhaps the CSV "driver" can export an error to chirp indicating that we should look at the log? Then chirp pops a small dialog telling us there were issues, and perhaps give us a link|chance to open the log. Had I done that, I'd have saved myself several hours of posting on the list and testing what was happening. But in the past, when I have seen references to "debug" log, I assumed that the log would contain errors that I was aware of, such as chirp shutting down by itself. Instead, I now see that it contained useful info that I was not aware had been written.
So...I used to do software engineering. I have literally decades of experience crossing many platforms and languages. All that really means is that I have opinions based on my experiences. When I started out, I started to build on the successes and failures of others, and some of those pointed me in the direction of using the IEEE software engineering standards. I have interpreted those docs, and other software engineering docs, in specific ways that make sense to me and have led to my successes over the years. YMMV.
In my world, the first thing I'd do for offset, for example, is to fill the field with 0.0 (or whatever is reasonably valid), then try to convert the field data to a FP number. If that failed, I'd log it as chirp does, and leave the field with a valid entry. I'm fine if this is not a good design for chirp, especially now that I know to check the "debug" log. I might just make it a habit to regularly check it; it takes so little time.
My confusion stems somewhat from the inconsistent handling. The Duplex field can have '' (empty; that is, in the csv file, comma,comma, with nothing between. My idea of an empty field. Apparently radios can have this field empty. (Note again that I have used chirp often, but think of myself as a naive user.) But Offset can't be "empty". I guess radios fill that field with something when we save a simplex channel. So "inconsistent" here really means "uneducated".
I now know what to do with spreadsheets, so I will make sure that I do that in the future. I filed the bug report because my software engineering expectations are different from the chirp requirements specifications and design specs, but I can understand the points made here. Those of us who have significant development experience traveled different paths. I filed the bug report because that was the suggestion on the list, but I have been a happy camper with chirp for a long time, and all I wanted to do was share a different POV. It may not be worth the effort, given the /lack/ of others expressing the same POV.
In closing, I did feel that I wanted to be fair to Excel, and post my successes there. Admittedly, I don't know how other spreadsheets would have handled the same editing process by the same individual (I don't know /what/ he did, other than decide (for example) that simplex channels don't need an offset.), but we should not be distroing a myth that Excel is unsat. Perhaps the experiences that contribute to this POV are from an earlier version of Excel. I am using 2010.
~R~
Updated by Dan Smith 10 months ago
No. I was given the file, so I don't know what editing the author did. I had passed him a file with defaults filled in, and he passed it back...with other corrections...and defaults edited out. His reasoning was that, for example, there was no Tone option selected, there should be no tone value entered. (Aside: you might have seen in one of my earlier posts that some folks think that chirp inserts defaults for PL tone.) The casual spreadsheet user is likely not able to distinguish between a cell that is "blank" and one that is truly empty ("CLEARed", in Excel). But I only mention it, based on my experiences working with users and teaching spreadsheet use, not to address the issue here. The point is that I was handed a file that /looked/ OK to me, as a chirp novice (Yes, I have used it for years, but not studied it to become an expert. I simply wanted to use it for several reasons mentioned in the past, and that are similar to many other users.) I didn't know that chirp would choke on empty fields, and I didn't know that "apparently empty" fields actually had "spaces", and were not "empty".
Yeah, I totally agree and surely wouldn't expect the average spreadsheet user to realize the difference between empty and a space, nor notice that a space is there. I'd also maybe go so far as to say that someone at that level might not even be able to make sense of an error message that quoted a space character as a problem, which makes it difficult to even expose.
No. Or yes. I think that if the channel is a simplex channel, then no offset is applicable. Of course, chirp doesn't necessarily know that I want that channel to be simplex, but the lack of an offset is reasonable to me. To get the csv file to pass, I simply changed all the empty offset channels to 0.6, and got on with business. Because the channels are to be used as simplex, it could have been any value, and the end users are not likely to notice or care. But I don't know what the radios need, so I can see using 0.0, also. In fact, I might try this, and if it passes, update my notes.
Offset just has to be a number, so 0.0 is fine as well, yeah. When CHIRP is reading a line of CSV, it's populating its internal data structure from the fields (in whichever order you've re-arranged them in the file). That internal data structure is rigidly defined, in that you can't store "" or NULL in a field that must be a number. So saying that we shouldn't care about the offset field if duplex is unset means that we have to have already populated the duplex field and we have to know about the relationship between those fields (which isn't actually a requirement). It's quite legit to want to set the offset on a simplex channel, if you want it to be stored in the radio like that. Doing so for a "talkaround" channel where you communicate on the output of the repeater locally and quickly enable duplex to go through the repeater is not an uncommon use-case for commercial users.
Bingo. If chirp actually considers it to have a value...I guess on reading from the radio...then it seems reasonable to me that it can default on import in the same way. I also see the ham's POV who gave me the file. If no PL tone is needed, then why insert one? I havna checked IC-2100H radio channels to see if they default to a value if a memory is stored as simplex, and no PL tone is specified, or even if a channel /can/ be stored without specifying tone.
Yeah, I think the problem comes down to one of scope. We don't provide CSV format support so you can do all your editing outside of chirp. It's more just that it's a very convenient, human-readable, hardware-neutral format. We used to also support an XML representation of a set of channels as well, which wouldn't have this problem as much, but it was more complex for little benefit.
My point is, I feel like if you're going to edit your CSV files in another tool (be it Excel or Notepad), you have to not break the format. I don't want to get into the business of being overly tolerant of the things in CSV files, to the point of letting people specify frequencies in Hertz and tone modes in whatever their radio calls "tone squelch mode".
Well, I'd have been further along if folks on this list had responded to my direct questions about default values. Ditto, if I could have found something about this in the docs. I saw several dire warnings about using Excel, but they contained little actual fact, only mythology. I have been using Excel for chirp files for nearly a decade. No problems.
Yeah, I think it's because Excel makes it easy to break the format, and is probably mentioned by name because it's the most popular/notable too with which you can break your files :)
I have an understanding now, and have added my notes to the Excel file for future reference. Perhaps the CSV "driver" can export an error to chirp indicating that we should look at the log? Then chirp pops a small dialog telling us there were issues, and perhaps give us a link|chance to open the log. Had I done that, I'd have saved myself several hours of posting on the list and testing what was happening. But in the past, when I have seen references to "debug" log, I assumed that the log would contain errors that I was aware of, such as chirp shutting down by itself. Instead, I now see that it contained useful info that I was not aware had been written.
Well, I certainly don't want to tell users to look at the debug log to see what went wrong. As I mentioned, the interface for loading a radio image (or CSV file in this case) doesn't provide a way to communicate to the UI that it should warn the user that something went wrong. The assumption is that we were able to read the file, or were't and here's the error message.
That said, I was experimenting last night with an arrangement that would let the UI collect warning- or error-level log messages from the driver that would have been only sent to the log, and show them to the user. It works, and exposes the problem with your sample file on the screen, but I'm not sure how I feel about it. The log is really intended for developers and assuming those log messages are useful to (and formatted/translated for) the user is a bit of a leap.
In my world, the first thing I'd do for offset, for example, is to fill the field with 0.0 (or whatever is reasonably valid), then try to convert the field data to a FP number. If that failed, I'd log it as chirp does, and leave the field with a valid entry. I'm fine if this is not a good design for chirp, especially now that I know to check the "debug" log. I might just make it a habit to regularly check it; it takes so little time.
Well, this is what we do (roughly), but we don't just silently ignore the error. I imagine it would be equally maddening to import a line that clearly has a value for offset that you think is valid, but on import, gets "overwritten" with some default for some reason you don't understand. I'd say it's worse in that case, because you might import 1000 memories, and not realize that 20 or so of them didn't get the offset you intended because the default was taken, but the rest of the memory was present. It's a lot more obvious to be missing whole memories than to have one value out of place in a field of a thousand.
That said, of course, the best thing to do (IMHO) is indicate the problem to the user so there's no question. It's just that we don't have any way to do this currently because of the architecture of the driver model, and because CSV is the only driver that really has this problem.
My confusion stems somewhat from the inconsistent handling. The Duplex field can have '' (empty; that is, in the csv file, comma,comma, with nothing between. My idea of an empty field. Apparently radios can have this field empty. (Note again that I have used chirp often, but think of myself as a naive user.) But Offset can't be "empty". I guess radios fill that field with something when we save a simplex channel. So "inconsistent" here really means "uneducated".
Well, we could have named this "simplex" but almost no radios have a "simplex" setting. They either have a duplex or nothing, and that's what we're modeling here.
In closing, I did feel that I wanted to be fair to Excel, and post my successes there. Admittedly, I don't know how other spreadsheets would have handled the same editing process by the same individual (I don't know /what/ he did, other than decide (for example) that simplex channels don't need an offset.), but we should not be distroing a myth that Excel is unsat. Perhaps the experiences that contribute to this POV are from an earlier version of Excel. I am using 2010.
Yep, and if you find any pages on this site admonishing Excel, let me know.
Updated by Dan Smith 9 months ago
- Status changed from New to Closed
- % Done changed from 0 to 100
Applied in changeset github|fe90410c2a5ddc974b04c625de4dc9959a47ffb0.