Ticket #3713 (closed defect: fixed)

Opened 6 years ago

Last modified 6 years ago

Linked Data causes ~25% overhead/DB explosion

Reported by: tdb01r
Owned by: cjg
Priority: Must do
Milestone: EPrints 3.2.1
Component: RDF
Version: 3.2
Severity: normal
Keywords:


The rdf field (when populated) adds ~58 values per record, corresponding to ~300 DB rows.

  • any change to a dependent field requires rewriting the entire rdf field (300+ rows)
  • requesting an eprint reads all of those rows

In testing, this adds ~25% overhead to all retrieval-based EPrints operations.

For a medium-sized repository (10k records) the rdf tables will be 500k rows long. This is not scalable for a core field.
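As a rough back-of-the-envelope check, the figures above can be multiplied out. This is only a sketch using the approximate numbers quoted in this ticket. The function name `rdf_footprint` is made up for illustration, and the ticket does not say how the ~300 rows per record are split between tables, so only totals are computed.

```python
# Approximate figures quoted in the ticket.
VALUES_PER_RECORD = 58   # rdf values added per record
ROWS_PER_RECORD = 300    # DB rows those values correspond to
RECORDS = 10_000         # "medium-sized repository"


def rdf_footprint(records, values_per_record=VALUES_PER_RECORD,
                  rows_per_record=ROWS_PER_RECORD):
    """Return (total_values, total_rows) for the rdf field across a repository."""
    return records * values_per_record, records * rows_per_record


total_values, total_rows = rdf_footprint(RECORDS)
print(f"{total_values:,} rdf values, {total_rows:,} DB rows in total")
```

Either way the cost grows linearly with the number of records, which is the scalability concern raised here.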

Change History

Changed 6 years ago by cjg

The RDF does *not* need to be retrieved when doing $dataset->dataobj(xxx) etc.

Would it make more sense to store the RDF in a standalone table (or dataset)?

Changed 6 years ago by tdb01r

  • owner set to cjg
  • component changed from - to RDF

Changed 6 years ago by cjg

  • status changed from new to closed
  • resolution set to fixed

Fixed in r5090. Now stored in a separate dataset.
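The idea behind the fix can be sketched as follows. This is not the EPrints code from r5090 (EPrints is written in Perl); it is a minimal Python/sqlite3 illustration, and the table names `eprint` and `eprint_rdf`, the `EPrint` class, and its `rdf()` method are all hypothetical. It assumes the RDF lives in its own table and is read only when asked for, so a plain record fetch never touches the RDF rows.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the RDF kept apart from the core record."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE eprint (eprintid INTEGER PRIMARY KEY, title TEXT)")
    # Hypothetical separate store for the RDF, keyed by eprintid.
    db.execute(
        "CREATE TABLE eprint_rdf "
        "(eprintid INTEGER, subject TEXT, predicate TEXT, object TEXT)"
    )
    return db


class EPrint:
    """Loads the core fields up front; the RDF is loaded lazily on first use."""

    def __init__(self, db, eprintid):
        self.db = db
        self.eprintid = eprintid
        row = db.execute(
            "SELECT title FROM eprint WHERE eprintid = ?", (eprintid,)
        ).fetchone()
        if row is None:
            raise KeyError(eprintid)
        self.title = row[0]
        self._rdf = None  # not read until rdf() is called

    def rdf(self):
        """Return the RDF triples, querying the separate table only once."""
        if self._rdf is None:
            self._rdf = self.db.execute(
                "SELECT subject, predicate, object FROM eprint_rdf "
                "WHERE eprintid = ?",
                (self.eprintid,),
            ).fetchall()
        return self._rdf


db = make_db()
db.execute("INSERT INTO eprint VALUES (1, 'Example')")
db.execute("INSERT INTO eprint_rdf VALUES (1, 'epid:1', 'dc:title', 'Example')")
ep = EPrint(db, 1)   # reads one row from eprint, none from eprint_rdf
triples = ep.rdf()   # the RDF rows are read only here
```

With this layout, updating a dependent field means rewriting rows in the RDF store only, and ordinary retrieval pays nothing for the RDF.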
