(URGENT) Cannot fetch from "W3C PR-math-19980224" #38
Comments
I've also tried these:
|
@ronaldtse there isn't <REC rdf:about="https://www.w3.org/TR/1998/REC-MathML-19980407/">
<dc:date>1998-04-07</dc:date>
<dc:title>Mathematical Markup Language (MathML) 1.0 Specification</dc:title>
<doc:obsoletes rdf:resource="https://www.w3.org/TR/1998/PR-math-19980224"/>
<doc:versionOf rdf:resource="https://www.w3.org/TR/REC-MathML/"/>
<editor rdf:parseType="Resource">
<contact:fullName>Patrick D F Ion</contact:fullName>
</editor>
<editor rdf:parseType="Resource">
<contact:fullName>Robert R Miner</contact:fullName>
</editor>
<org:deliveredBy rdf:parseType="Resource">
<contact:homePage rdf:resource="https://www.w3.org/Math/"/>
</org:deliveredBy>
<mat:hasErrata rdf:resource="https://www.w3.org/MarkUp/mathml101-updates/errata.html"/>
</REC>
Yes, we need all documents in the dataset, including obsoleted ones. An obsolete document may still be cited by others.
Then we need to scrape the documents that are missing from tr.rdf from the www.w3.org website. We don't know in advance which documents are missing, but we can check whether the target of a relation is missing from tr.rdf. If it is, we get it from www.w3.org.
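The check described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function names are hypothetical, the sample is a trimmed copy of the record quoted earlier, and the namespace URIs declared for the default prefix and for `doc:` are assumptions (only the `rdf:` namespace is taken as given). The element is matched by its local name `obsoletes`, so the sketch does not depend on the `doc:` URI being right.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed sample; the default and doc: namespace URIs are assumptions.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.w3.org/2001/02pd/rec54#"
         xmlns:doc="http://www.w3.org/2000/10/swap/pim/doc#">
  <REC rdf:about="https://www.w3.org/TR/1998/REC-MathML-19980407/">
    <doc:obsoletes rdf:resource="https://www.w3.org/TR/1998/PR-math-19980224"/>
    <doc:versionOf rdf:resource="https://www.w3.org/TR/REC-MathML/"/>
  </REC>
</rdf:RDF>"""


def normalize(url):
    """Drop trailing slashes so '.../X/' and '.../X' compare equal."""
    return url.rstrip("/")


def missing_obsoleted(rdf_xml):
    """Return obsoleted URLs that have no record (rdf:about) of their own."""
    root = ET.fromstring(rdf_xml)
    described = set()  # documents that have their own record
    obsoleted = set()  # targets of any *:obsoletes relation
    for element in root.iter():
        about = element.get(f"{{{RDF_NS}}}about")
        if about:
            described.add(normalize(about))
        # Match on the local name only, whatever namespace the element is in.
        if element.tag.rsplit("}", 1)[-1] == "obsoletes":
            target = element.get(f"{{{RDF_NS}}}resource")
            if target:
                obsoleted.add(normalize(target))
    return sorted(obsoleted - described)


missing = missing_obsoleted(SAMPLE)
print(missing)
```

Each URL in `missing` is a candidate to fetch from www.w3.org. The trailing-slash normalization matters here because the issue quotes the URL with a trailing slash while the `obsoletes` relation omits it.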
@ronaldtse I see what happened. The most recent UPD found the issue with the link to archives.
@andrew2net yes, you are correct. That's likely the only way we can get the full archive.
@ronaldtse wouldn't it be better to scrape all W3C documents from https://www.w3.org/TR/? We can find obsoleted docs in the history.
@andrew2net then let's scrape that, but I don't think the details are as complete as in the RDF file. So we have to combine the two sources?
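One way to picture combining the two sources is a merge keyed by document URL, where the RDF record wins for any field it actually provides and scraped data only fills the gaps. The sketch below is an assumption about how that could look, not what the project does: `merge_records`, the dictionary layout, and the field names are hypothetical, and the scraped values are placeholders (the date for the PR document is simply the one implied by its URL).

```python
def merge_records(rdf_records, scraped_records):
    """Merge two {url: {field: value}} maps; non-empty RDF values take priority."""
    merged = {url: dict(record) for url, record in scraped_records.items()}
    for url, record in rdf_records.items():
        combined = merged.setdefault(url, {})
        for field, value in record.items():
            # Only overwrite scraped data when the RDF field carries a value.
            if value not in (None, "", []):
                combined[field] = value
    return merged


REC = "https://www.w3.org/TR/1998/REC-MathML-19980407"
PR = "https://www.w3.org/TR/1998/PR-math-19980224"

# Values taken from the RDF record quoted in this thread.
rdf_records = {
    REC: {
        "title": "Mathematical Markup Language (MathML) 1.0 Specification",
        "date": "1998-04-07",
    },
}

# Hypothetical scraper output: fewer details, but it covers the missing document.
scraped_records = {
    REC: {"title": "", "date": "1998-04-07"},
    PR: {"date": "1998-02-24"},
}

merged = merge_records(rdf_records, scraped_records)
print(sorted(merged))
```

With this ordering, documents that exist only on the website still end up in the result, while documents present in both sources keep the richer RDF details.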
@ronaldtse you are right. RDF data has more details. We need to do the following:
Once we do it we will have as much as we can from the archives. The |
@ronaldtse I fetched all the |
@ronaldtse I'm trying to get all the obsoleted relations that are missing from the RDF archive from the w3c.org website, but the pages have inconsistent layouts. I adapted the scraper to get data from some layouts, but it looks like there are many more of them, and some layouts conflict with each other. I don't know how much time it will take to cover all the variants, and I want to clarify whether I should continue to invest time in this problem.
I'm trying to auto-fetch the reference of this document: https://www.w3.org/TR/1998/PR-math-19980224/,
with no success: