The most recent addition to wallstreetlocal was the ability to query XML files along with HTML files. The only format left to support now is plain text (TXT).
The SEC's XML and HTML stock data was barely structured enough to be queried accurately, but TXT poses an even harder challenge. The problem is inconsistency: while tables in TXT can be read fairly easily by human eyes, they are too dissimilar from one another to query effectively.
Here are some minified examples.
<TABLE> <C> <C>
FO RM 13F IFORMATIONTABLE
VALUE SHARES/ SH/ PUT/ INVSTMT OTHER VOT ING AUTRITY
NAME OF ISSUER TITLE OF CLASS CUSIP (X1000)PRN AMT PRN CALL DSCRETN MANAGERSOLE SHARED NONE
------------------------------------------- ------------------ ---- ------- -----------------------------
AFLAC INC COMMON STOCK 001055102 3 71SH DEFINED 71 0 0
AGL RESOURCES INC COMMON STOCK 001204106 123 3025SH DEFINED 3025 0 0
ABBOTT LABS COM COMMON STOCK 002824100 1606 30519SH DEFINED 27798 2721 0
ABERCROMBIE & FITCHCOMMON STOCK 002896207 0 2SH DEFINED 2 0 0
AIR PRODUCTS & CHEMCOMMON STOCK 009158106 16728 175017SH DEFINED 140030 2282 32705
AIRGAS INC COMMON STOCK 009363102 4 52SH DEFINED 52 0 0
</TABLE>
<TABLE> <C> <C> <C> <C> <C> <C> <C>
VALUE INV. OTH vtng
NAME OF ISSUER CLASS CUSIP (x$1000) SHARES disc MGRS AUTH
Albertson College of Idaho Large Growth
ADC TELECOMMUNICATIO COMM 000886101 $18 337.00 Sole N/A Sole
AFLAC INC COMM 001055102 $14 298.00 Sole N/A Sole
AES CORP COMM 00130H105 $18 232.00 Sole N/A Sole
AXA FINL INC COMM 002451102 $18 504.00 Sole N/A Sole
ABBOTT LABS COMM 002824100 $61 1,724.00 Sole N/A Sole
ABERCROMBIE & FITCH COMM 002896207 $2 114.00 Sole N/A Sole
</TABLE>
<TABLE>
VALUE SHARES/ SH/ PUT/ INVSTMT -----VOTING AUTHORITY-----
NAME OF ISSUER -TITLE OF CLASS- --CUSIP-- (X$1000) PRN AMT PRN CALL DSCRETN -MANAGERS- SOLE SHARED NONE
<C> <C>
D DAIMLERCHRYSLER AG ORD D1668R123 5 112 SH DEFINED 05 112 0 0
D DAIMLERCHRYSLER AG ORD D1668R123 31 748 SH DEFINED 05 748 0 0
D DAIMLERCHRYSLER AG ORD D1668R123 5 130 SH DEFINED 06 130 0 0
D DAIMLERCHRYSLER AG ORD D1668R123 246 5894 SH DEFINED 14 3089 0 2805
D DAIMLERCHRYSLER AG ORD D1668R123 118 2832 SH DEFINED 14 2104 604 124
D DAIMLERCHRYSLER AG ORD D1668R123 8 200 SH DEFINED 29 200 0 0
D DAIMLERCHRYSLER AG ORD D1668R123 63 1510 SH DEFINED 41 0 0 1510
</TABLE>
The column sizes, names, and overall formatting of each table change too often for any meaningful code to be written. Without writing a gargantuan amount of code or using AI (which is expensive), there doesn't seem to be much of a way to query stocks like this.
There should be a better, more effective method for taking the TXT tables and creating usable, structured data.
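One heuristic that might be worth exploring: in all three samples above, every data row contains a nine-character CUSIP, while the header, separator, and label lines do not. The sketch below is only a starting point and not how wallstreetlocal works. It assumes a CUSIP is eight alphanumeric characters followed by a numeric check digit, and it uses that as an anchor to split each row into "name and class" on the left and "everything else" on the right. The names `CUSIP_RE` and `split_on_cusip` are made up for illustration.

```python
import re

# Assumption: a CUSIP is 8 alphanumeric characters plus a numeric check digit.
CUSIP_RE = re.compile(r"\b[0-9A-Z]{8}[0-9]\b")


def split_on_cusip(line):
    """Split a row around its first CUSIP-like token.

    Returns (text_before, cusip, tokens_after), or None when the line
    has no CUSIP-like token (headers, separators, section labels).
    """
    match = CUSIP_RE.search(line)
    if match is None:
        return None
    before = line[:match.start()].strip()
    after = line[match.end():].split()
    return before, match.group(), after


# Rows copied from the three sample tables, plus one header line.
rows = [
    "AFLAC INC COMMON STOCK 001055102 3 71SH DEFINED 71 0 0",
    "AES CORP COMM 00130H105 $18 232.00 Sole N/A Sole",
    "D DAIMLERCHRYSLER AG ORD D1668R123 5 112 SH DEFINED 05 112 0 0",
    "NAME OF ISSUER TITLE OF CLASS CUSIP (X1000)PRN AMT PRN CALL",
]

for row in rows:
    print(split_on_cusip(row))
```

This does not solve the problem: the tokens after the CUSIP still differ from table to table (`71SH` versus `112 SH`, `$18` versus `3`), and a nine-digit number elsewhere in a row would be mistaken for a CUSIP. It only narrows the question to "what do the columns after the CUSIP mean in this table".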