Hi All,
When processing a very large file (VLF) in VO, I find that the number of lines output is less than the number of lines input.
I wrote some code below to read a file line by line and return the line count.
FUNCTION Start()
    LOCAL cInputFile AS STRING
    LOCAL nInputLineCount AS DWORD

    cInputFile := K_FILE_PATH
    ? cInputFile
    IF File(cInputFile)
        ? "Will count lines ..."
        WAIT
        nInputLineCount := GetLineCount(cInputFile)
        ? "Lines=", nInputLineCount
    ELSE
        ? cInputFile, "File not found"
    ENDIF
    WAIT
    RETURN NIL

FUNCTION GetLineCount(cFile AS STRING) AS DWORD PASCAL
    // Assumes the file exists; returns 0 if it cannot be opened
    LOCAL pFile AS PTR
    LOCAL nCount AS DWORD

    pFile := FOpen(cFile, FO_READ)
    IF pFile == F_ERROR
        ? DosErrString(FError())
        RETURN 0
    ENDIF
    // Read one line per call (at most 1024 bytes) until end of file
    DO WHILE ! FEof(pFile)
        FGetS2(pFile, 1024)
        ++nCount
    ENDDO
    FClose(pFile)
    RETURN nCount

DEFINE K_FILE_PATH := "EPD_202203.csv"
The input file is 6,522,309,040 bytes in size and has 17,938,549 lines.
Run in VO, this code gives 11,816,979 lines!
Run in X#, this code gives the correct answer of 17,938,549 lines.
Can anyone please explain why VO will not read the entire file? Is it a bug in the VO runtime or a limit in the WIN32 API functions?
You can find the actual data here if you want to test. Make sure you download the ZIP format!
https://opendata.nhsbsa.net/dataset/eng ... 4540a962fd
This post is linked to my other post on the Macro compiler. If I can solve one of the problems I can forget the other
Don
Reading Very Large files
Two suggestions:
1. If this code has worked before, then it's probably the input file. Try saving the file with an editor that enforces DOS line terminators: CHR(13) + CHR(10).
2. Use FReadLine() instead of FGetS2(). The documentation says there's no difference, but there might be.
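One way to check the first suggestion without opening a multi-gigabyte file in an editor is to count the terminators directly. The following is only a sketch, written in Python rather than VO or X# purely for illustration; the function name count_line_terminators and the chunk size are assumptions, not anything from the VO runtime.

```python
def count_line_terminators(path, chunk_size=1 << 20):
    """Return (crlf, bare_lf, bare_cr) counts for the file at `path`.

    The file is read in fixed-size binary chunks, so memory use stays
    small no matter how large the file is.
    """
    crlf = total_lf = total_cr = 0
    prev_ended_with_cr = False  # handles a CR/LF pair split across two chunks
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crlf += chunk.count(b"\r\n")
            if prev_ended_with_cr and chunk.startswith(b"\n"):
                crlf += 1  # the CR was the last byte of the previous chunk
            total_lf += chunk.count(b"\n")
            total_cr += chunk.count(b"\r")
            prev_ended_with_cr = chunk.endswith(b"\r")
    # Every CRLF pair contributes one CR and one LF to the totals.
    return crlf, total_lf - crlf, total_cr - crlf
```

If the second and third numbers come back as zero, the file uses CHR(13) + CHR(10) throughout and the terminators can be ruled out as the cause.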
Joe Curran
Ohio USA
Thanks Joe.
This file has standard line terminators.
The problem only seems to happen with VLFs, beyond about 11m lines.
I think I already tried FReadLine() ... will check anyway.
You can check this code on any file - just change the K_FILE_PATH value to point to your data.
Don
Don,
Are there lines in the file with line length > 1024?
Robert
XSharp Development Team
The Netherlands
robert@xsharp.eu
Thanks for your reply Robert.
Line lengths are variable (CSV) but seem to be < 512, although I don't know if there's a long one hidden somewhere. It's difficult to look through a 6 GB file ...
However, the same code run in X# gives the correct answer of 17,938,549 lines.
Don
Don,
It's very easy to check that with X#. Just use System.IO.File.ReadAllLines() in a small test app and then check the length of each line returned in the array.
Of course you'll need to have enough memory in your system for this simple way to work! And compile in AnyCPU/x64 mode...
Chris Pyrgas
XSharp Development Team
chris(at)xsharp.eu
Thanks a lot for your reply Chris.
Well I didn't want to read the file into memory as there is a hell of a lot of it!
But I would like to know why, reading line by line, I was unable to get past 11m lines with VO. Is the blockage in the VO runtime or the underlying WIN32 API functions?
Anyway, it's not really important now as Robert helped fix my macro problem, so I have successfully processed all 17m lines in X# - yess. It took about 30 mins (9 yr old pc)
Thanks all for rapid response -
Don
Hi Don,
I didn't mean to do this in your real app! I only suggested to do it in a small 10 line test app, just to find out if your file contains large lines.
But it's not important anymore, only maybe if you wanted to do it out of curiosity.
Chris Pyrgas
XSharp Development Team
chris(at)xsharp.eu
The problem in VO is perhaps related to the size of the file. Offsets beyond 4 GB do not fit in a DWORD, so they need a wider file pointer. I remember that the old and famous PKZIP also had file-size limits.
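The figures Don posted can be used as a rough sanity check of this idea. The sketch below (in Python, purely for the arithmetic) assumes that lines are of roughly uniform length across the file, which is an assumption and not something the thread confirms. Under that assumption it estimates how many bytes VO had read when it stopped and compares that with 2^32, the largest offset a DWORD can hold plus one.

```python
# Figures quoted in the first post of this thread
FILE_SIZE_BYTES = 6_522_309_040
TOTAL_LINES = 17_938_549      # count reported by X#
VO_LINES = 11_816_979         # count reported by VO

DWORD_LIMIT = 2 ** 32         # 4,294,967,296: first offset a 32-bit DWORD cannot hold

# Assumption: lines are roughly the same length throughout the file
avg_line_bytes = FILE_SIZE_BYTES / TOTAL_LINES
approx_bytes_read_by_vo = VO_LINES * avg_line_bytes
relative_gap = abs(approx_bytes_read_by_vo - DWORD_LIMIT) / DWORD_LIMIT

print(f"File larger than a DWORD can address: {FILE_SIZE_BYTES > DWORD_LIMIT}")
print(f"Average line length: {avg_line_bytes:.1f} bytes")
print(f"Estimated bytes read by VO: {approx_bytes_read_by_vo:,.0f}")
print(f"Relative distance from 2^32: {relative_gap:.4%}")
```

The estimate lands within a fraction of a percent of 2^32, which is consistent with VO stopping near the 4 GB mark. It does not show whether the limit sits in the VO runtime or in the way it calls the WIN32 API.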
Arne
Hi Arne,
In VO, you would not load the whole file in memory, instead you would read line by line. And also in X#, for such huge files in real conditions you would normally do the same thing, but since this was only about a very small and quick test, I suggested doing it this crude way, in just 10 lines of code. Doing it properly and reading line by line and avoiding using a lot of memory would instead need 15 lines of code.
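For anyone curious, the line-by-line variant of that test might look like the sketch below. It is written in Python rather than X# purely for illustration, and the function name scan_lines and its limit parameter are hypothetical. It streams the file once and reports the line count, the longest line, and how many lines exceed the 1024-byte buffer Robert asked about.

```python
def scan_lines(path, limit=1024):
    """Stream the file once; return (line_count, longest_line, lines_over_limit).

    Lengths are measured in bytes, excluding the CR/LF terminator.
    Only one line is held in memory at a time.
    """
    line_count = 0
    longest_line = 0
    lines_over_limit = 0
    with open(path, "rb") as f:
        for raw in f:  # binary mode: iteration splits on LF
            length = len(raw.rstrip(b"\r\n"))
            line_count += 1
            if length > longest_line:
                longest_line = length
            if length > limit:
                lines_over_limit += 1
    return line_count, longest_line, lines_over_limit
```

Called as scan_lines("EPD_202203.csv"), it would answer both open questions in a single pass: the true line count and whether any line is longer than 1024 bytes.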
Chris Pyrgas
XSharp Development Team
chris(at)xsharp.eu