Open
351 commits
fd67a5d
Update _README.md
artourkin Feb 11, 2014
579c7d8
Update _README.md
artourkin Feb 11, 2014
d026c08
README.md update
artourkin Feb 25, 2014
b45d20f
proper rules edition
artourkin Mar 11, 2014
05f0a0a
deadlock solving. Not finished.
artourkin Mar 20, 2014
ee5c796
proper rules edition
artourkin Mar 11, 2014
5e33295
The deadlock has been solved
artourkin Mar 21, 2014
ac7e615
Merge branch 'origin/deadlock'
artourkin Mar 24, 2014
e9f85fb
Rules clean-up
artourkin Mar 24, 2014
d87577c
Merge branch 'integration'
artourkin Mar 24, 2014
0829ca1
support for purl.dp.org vocabulary in xml profile added
artourkin Mar 24, 2014
10a356b
Shorter source ID length.
artourkin Mar 24, 2014
85d1a63
Merge branch 'origin/vocabulary'
artourkin Mar 25, 2014
0f722d6
Small bug fixes:
artourkin Mar 25, 2014
834b940
extracted property set restored
artourkin Mar 26, 2014
c761a30
Merge remote-tracking branch 'refs/remotes/origin/bugfixing'
artourkin Apr 1, 2014
27c9d00
adjusts readme for micro-site generation
kraxner Apr 24, 2014
c9f3eaa
added MORE.md
kraxner Apr 24, 2014
95dbbdb
Merge pull request #2 from kraxner/master
carlwilson Apr 24, 2014
65e3909
Corrected typo.
carlwilson May 12, 2014
7abbc1a
Corrected typos.
carlwilson May 12, 2014
353d853
Corrected typos & changed Acknowledgements.
carlwilson May 12, 2014
098610a
Corrected typo.
carlwilson May 12, 2014
b05a8f8
Added information to more section of ABOUT.md.
carlwilson May 12, 2014
9e4b503
vocabulary updated
artourkin Jun 3, 2014
eb44896
basic vagrant setup
Jul 14, 2014
39edc2e
Merge pull request #3 from lecs/vagrant
carlwilson Jul 16, 2014
c540ef3
MongoFilterSerializer bug fixed
artourkin Mar 5, 2015
09bb3ba
Packaging algorithm class created
artourkin Mar 6, 2015
0b52c79
sample fits files added
artourkin Mar 23, 2015
ae7dca6
samplling unittest setup and teardown ready
artourkin Mar 24, 2015
45e6bdd
samplling unittest is fully working
artourkin Mar 25, 2015
91b5e33
samplling unittest is fully working
artourkin Mar 30, 2015
bdcc258
c3po-core-0.3.0.jar added to web-app/lib/
artourkin Mar 31, 2015
a1084a9
Property set templates are added to the web-app
artourkin Aug 19, 2015
ff50825
Added numeric graphs support to templates
artourkin Aug 26, 2015
5364663
adding xml config
artourkin Aug 31, 2015
75bc709
adding xml config
artourkin Sep 2, 2015
3d5e7e4
the test added
artourkin Sep 7, 2015
f8563d6
configuration test added
artourkin Sep 7, 2015
42afc15
added content type identification post processing rule
artourkin Sep 8, 2015
93cecf4
content type identification rule added
artourkin Sep 9, 2015
4760288
updated c3po-web
artourkin Sep 9, 2015
88f441c
templating config updated
artourkin Sep 14, 2015
c1d5f63
Added "Unknown" to content type identification rule
artourkin Sep 14, 2015
ee5992e
small bug fixes
artourkin Sep 14, 2015
1cd4d68
added path to rules to config
artourkin Sep 14, 2015
617008d
bug fix
artourkin Sep 14, 2015
c1eb1e4
added template config file
artourkin Feb 24, 2016
78c5c62
added exporting/importing templates to web-app
artourkin Feb 25, 2016
a5b3cba
moved templates configuration to Export tab
artourkin Feb 26, 2016
fb357da
template config updated
artourkin Feb 29, 2016
4d7e3c6
Added deconflictCommand to CLI
artourkin Mar 5, 2016
5a13450
added conflict reduction to c3po-cmd
artourkin Mar 7, 2016
3f2fa1f
updated template config
artourkin Mar 7, 2016
d9c50c1
added info about currently activated template to overview
artourkin Mar 7, 2016
38d4185
fixed a bug when identifying an unknown content type
artourkin Mar 7, 2016
1886ea7
fixed a bug with content type identification
artourkin Mar 8, 2016
a0833ce
fixed bug with templates
artourkin Mar 10, 2016
1a94f30
fixed bug in templates
artourkin Mar 10, 2016
93bf7fc
Major refactoring of c3po-web. Dropped legacy c3po-core and c3po-api …
artourkin Mar 15, 2016
f9290a3
first stablre version after refactoring
artourkin Mar 16, 2016
4db63fb
first stablre version after refactoring
artourkin Mar 16, 2016
618d1bc
migrated play from 2.0.4 to 2.1.0
artourkin Mar 17, 2016
5ab41c6
migrated play from 2.1.0 to 2.2.0
artourkin Mar 17, 2016
a4cce29
migrated play from 2.2.0 to 2.3.0
artourkin Mar 17, 2016
702e96d
fixed bugs when selecting unknown and conflicted values of properties
artourkin Mar 21, 2016
5bd582d
added support for numerical histograms
artourkin Mar 23, 2016
db01b62
fixed bugs with statistics calculation
artourkin Mar 23, 2016
9186e6a
fixed a bug in overview view
artourkin Mar 23, 2016
2688114
enabling caching of histogram results
artourkin Mar 23, 2016
18e7fa0
returned a previous statistics calculation method
artourkin Mar 23, 2016
3d9b76a
fixed templates config
artourkin Mar 24, 2016
ec545b5
Added source toolnames to elements details. This closes #10.
artourkin Mar 25, 2016
b0bcfd6
Code cleanup.This commit fixes issue #8 on default templates.
artourkin Mar 26, 2016
b79133e
small fixes
artourkin Mar 28, 2016
5785be5
Added UI for conflict resolution.
artourkin Mar 30, 2016
def71c8
Conflict resolution processor is created.
artourkin Mar 31, 2016
f81f319
fixed bugs with map-reduce queries during conflict resolution
artourkin Apr 1, 2016
cf05a9b
added all UI components
artourkin Apr 5, 2016
5b8b223
first working version of conflict resolution UI
artourkin Apr 6, 2016
a9929d0
Update README.md
artourkin Apr 7, 2016
62831a5
minor fixes
artourkin Apr 8, 2016
5473c31
this commit fixes issues #17 and #14
artourkin Apr 8, 2016
ed0aef1
This closes #10.
artourkin Apr 8, 2016
69ac876
This closes issue #24.
artourkin Apr 8, 2016
2247f4c
Code refactoring.
artourkin Apr 10, 2016
d755b81
Update README.md
artourkin Apr 11, 2016
c7e001a
This closes issue #20.
artourkin Apr 11, 2016
b0bb25a
This commit is dedicated to the issue #13.
artourkin Apr 12, 2016
4c5a1c3
Removed the background image and the Delete button from the Conflicts…
artourkin Apr 12, 2016
730ee0f
A method to calculate conflict resolution stats added.
artourkin Apr 14, 2016
6cfae4c
Merge branch 'master' into conflictResolutionUI
artourkin Apr 15, 2016
33140c1
This addresses issue #17.
artourkin Apr 15, 2016
936cd4b
This commit addresses issue #16.
artourkin Apr 15, 2016
8248a6b
This commit addresses issue#8. Now, the list of properties for overvi…
artourkin Apr 15, 2016
95ae6b2
Increasing c3po-web version to 0.4.
artourkin Apr 19, 2016
f027e99
updated description and links to the wiki
artourkin Apr 19, 2016
873e971
Removed play2war plugin dependancy. This fixes issue #39.
artourkin Apr 20, 2016
59ecdb2
Restructured the conflict overview table.
artourkin Apr 20, 2016
2046a82
Added conflict overview table.
artourkin Apr 21, 2016
f670f49
Changed the lable 'Rest' to 'Other' in histograms. This addresses iss…
artourkin Apr 22, 2016
eb62568
Handle exception during statistics calculation. Updated the text desc…
artourkin Apr 25, 2016
d5d7b36
Added additional check against null collections
artourkin Apr 26, 2016
d636522
Updated the dependencies
artourkin Apr 26, 2016
4d0a987
Fixed the issue #43 regarding statistics calculation. The results of …
artourkin Apr 27, 2016
7b93498
Replaced maven-copy-to dependency. Now it should be possible to compi…
artourkin May 3, 2016
23d4b0f
added a knockoutJS based table to the object view.
artourkin May 9, 2016
256643b
Added the table for the object view.
artourkin May 13, 2016
92b5dfc
Added a feature to render filters with conflictling conditions
artourkin May 20, 2016
6a0055e
added feature to remove conflicted filtering conditions. This commit …
artourkin May 25, 2016
7c94826
Merge branch 'conflictResolutionUI'
artourkin Jun 1, 2016
ca1123a
Fixed the issue #48. We needed to check a case, when a minimum value …
artourkin Jun 9, 2016
8715d72
Added properties to the default template.
artourkin Jun 9, 2016
a922fc1
unifying map-reduce jobs
artourkin Jun 9, 2016
2da7c9e
merged map-reduce queries.
artourkin Jun 10, 2016
4f54548
Fixed a bug with template loader. Now the list of properties is recal…
artourkin Jun 10, 2016
b61bafe
added 'Reset' button to restore the templates in the overview, added …
artourkin Jun 14, 2016
b0544ec
Resolved an issue when updating records on sharded mongodb
artourkin Jun 15, 2016
c89e908
Fixed a bug with generation the conflict overview table. Previosly, n…
artourkin Jun 16, 2016
91207f7
added a rule description textarea to the object details view.
artourkin Jun 16, 2016
24deef3
Added rule description column to the conflicts tab.
artourkin Jun 17, 2016
4a65ab7
The systematic sampling algorithm used ObjectID to refer to samples, …
artourkin Jun 17, 2016
d9de23f
Added a template for conflicts.
artourkin Jun 17, 2016
bfebb68
Added a conflicts count to stats section in overview
artourkin Jun 17, 2016
75de232
Added conflicts count to statistics section. This fixes issue#52.
artourkin Jun 17, 2016
76f363b
Added a cache to store filter values. This resolves issues #53 and #31.
artourkin Jun 22, 2016
e068922
Fixed the issue #54, which prevented setting filter values to 'Unknown'.
artourkin Jun 27, 2016
eef815f
Added '11' as a default value for sample size.
artourkin Jun 27, 2016
e5156a4
Added remove button to the conflicts tab. The rules are serialized to…
artourkin Jul 4, 2016
d60973a
Conflict resolution rules are loaded before adding new rules. This pr…
artourkin Jul 4, 2016
99968d5
Fixing a bug with sys sampling. Sometimes the algorithm doesn't stop.
artourkin Jul 5, 2016
ffe8a47
Fixing a bug with sys sampling. Sometimes the algorithm doesn't stop.
artourkin Jul 5, 2016
94b4c7d
added a FITS adaptor test. This commit addresses issue#57
artourkin Jul 22, 2016
3f157b4
This commit fixes issues #58 and #59. Digester is replaced with DOM-p…
artourkin Aug 16, 2016
966af97
All tests are OK.
artourkin Aug 16, 2016
cc696c7
Bug fixing web-app after changes in the core.
artourkin Aug 16, 2016
04ae1c2
Added Selective Feature Distribution Sampling Algorithm.
artourkin Aug 17, 2016
704f8f1
adding output reports for sfd sampling.
artourkin Aug 18, 2016
9e7c25e
adding outputs for sfd.
artourkin Aug 20, 2016
dc09862
added results generation for sfd.
artourkin Aug 20, 2016
3379610
Debugging map-reduce queries.
artourkin Aug 27, 2016
4c6fa36
fixing issues with sfd
artourkin Aug 29, 2016
29e791f
disabled tests
artourkin Aug 29, 2016
ada01b9
Added a get route to export the histogram values
artourkin Sep 1, 2016
6510634
fixed the bug about DATE properties.
artourkin Sep 6, 2016
0759efa
fixed an issue with printing the conflict overview table
artourkin Sep 7, 2016
311dc43
fixed a bug about using FITS property mapping.
artourkin Sep 7, 2016
3b4f75b
fixing an issue in systematic sampling
artourkin Sep 22, 2016
97cc8d8
Reverted serilization of sources. Now elements collection in mongoDB …
artourkin Sep 27, 2016
93559a6
fixing bugs
artourkin Sep 27, 2016
4eaaadf
fixed bug occured during syst. sampling, which would result in infini…
artourkin Nov 9, 2016
b696331
updated the table style in object view
artourkin Nov 17, 2016
d34b3f9
The element overview table is now resizable.
artourkin Nov 20, 2016
24beafe
fixed a bug caused by conflict resolution processor
artourkin Nov 22, 2016
0253d06
added a new rule to extract year of document creation.
artourkin Nov 22, 2016
3cf215f
added feature to show long tail instead of other. This fixes issue#60.
artourkin Nov 22, 2016
eaa01ac
quick fix
artourkin Nov 23, 2016
614db97
fixed a bug occuring when sfd sampling applied on conflicted filter
artourkin Nov 23, 2016
d7bb63c
extended the csv export schema
artourkin Nov 27, 2016
7ac6ccf
fixed css container
artourkin Nov 27, 2016
370cad5
extended SFD sampling algorithm. The no proportion produces evenly di…
artourkin Nov 30, 2016
b2d2907
small fix
artourkin Dec 1, 2016
178df39
added a link to download a csv with conflict overview
artourkin Dec 4, 2016
b5f80a8
fixing a bug occured during SFD sampling. Map reduce queries should n…
artourkin Dec 6, 2016
9fa8aa7
adding new filtering classes.
artourkin Dec 9, 2016
2a02cf2
fixing tests
artourkin Dec 11, 2016
b6bb037
replacing mapreduce with aggregation
artourkin Dec 13, 2016
7ebe7ed
added countConflicts
artourkin Dec 14, 2016
e053081
first successful c3po web run
artourkin Dec 15, 2016
9b4df3c
updated the map-reduce queries to comply with the latest data model
artourkin Dec 20, 2016
6a36b84
updated mapreduce queries to calculate statistics
artourkin Dec 20, 2016
a76e4b2
refactoring the filter class
artourkin Dec 26, 2016
c074b1b
extenting web interface to support new filtering
artourkin Jan 2, 2017
f58a9a3
adding filters to the web app
artourkin Jan 9, 2017
0f8d65a
the new version of filtering is actual
artourkin Jan 26, 2017
14dcdf5
fixed bugs with filtering. now it working correctly
artourkin Jan 27, 2017
751c7cb
fixed an issue with element overview page.
artourkin Jan 27, 2017
6f5c932
fixed an issue occuring during generation of the conflict overview table
artourkin Jan 30, 2017
cea46d4
fixed an issue when extracting status from fits
artourkin Jan 30, 2017
bfb4d10
fixed incorrect stats caclulation
artourkin Jan 31, 2017
3aa2704
fixed bugs with sampling and exporting
artourkin Feb 1, 2017
e6c559f
fixed the issue with sampling
artourkin Feb 1, 2017
b49aa81
updated maven dependancy
artourkin Feb 1, 2017
6ff42e2
changed the default strict parameter of filter to false
artourkin Feb 7, 2017
cbf5d91
fixed a bug with sfd
artourkin Feb 9, 2017
96780de
added update methods for conflict resolution
artourkin Feb 27, 2017
d8ff54b
added update methods for conflict resolution
artourkin Mar 2, 2017
84d78d0
updated profileGenerator
artourkin Mar 27, 2017
c57c234
fixed a bug when resolving conflicts
artourkin Mar 30, 2017
2001636
fixed a bug when resolving conflicts
artourkin Apr 5, 2017
f8d1cb8
added a possibility to connect to Mongodb via URI. Now it is possible…
artourkin Jun 4, 2017
d7b26a5
fixed a bug in mongodb connector
artourkin Jun 22, 2017
5fe3350
fixed a bug in mongodb connector
artourkin Jun 22, 2017
a1ac707
added tuples export to csv in sfd sampling.
artourkin Jul 23, 2017
56b4ae9
removed the libs due to pull request
Aug 3, 2017
feea460
Merge branch 'dev'
artourkin Aug 3, 2017
b1fc500
moving to sbt
artourkin Aug 3, 2017
7c1a6e3
added sbt files
artourkin Aug 4, 2017
c51b1cf
removed poms
artourkin Aug 5, 2017
650d99f
added travis config file
artourkin Aug 5, 2017
e73bf6f
removed drools dependency, updated play to v 2.4.0
artourkin Aug 6, 2017
dc461b8
extending conflict overview
artourkin Aug 14, 2017
f089748
Added conflict overview table.
artourkin Aug 28, 2017
f4fd060
Removed timeout for ajax-requests.
artourkin Aug 28, 2017
e3d8456
added conflicts csv export.
artourkin Aug 31, 2017
33e996e
disabled filter strict mode.
artourkin Sep 3, 2017
67bb61f
Added property strictness to UI.
artourkin Sep 4, 2017
4647436
Extending cache to store property values.
artourkin Sep 4, 2017
ac0ffb9
fixed rule creation.
artourkin Sep 13, 2017
51bdaf3
fixed conflict rule table printing
artourkin Sep 17, 2017
b6000f5
fixed conflict overview table printing
artourkin Sep 19, 2017
d9f8219
added dockerfile
Sep 26, 2017
2b7889f
Fixed consolidator termination condition
Sep 27, 2017
a4b4e2f
updated Dockerfile
Sep 27, 2017
361f094
corrected a log message.
artourkin Sep 27, 2017
ed65126
changed the scope of conflict resolution rules to current filter.
artourkin Sep 27, 2017
4069fae
reverted changes when creating a new conflict resolution rule.
artourkin Oct 1, 2017
f56602b
now each document contains all possible metadata properties.
artourkin Oct 2, 2017
2418d44
fixed mapreduce queries according to the new datamodel
artourkin Oct 8, 2017
4c9a912
updated c3po template config file
artourkin Oct 8, 2017
a9e7c2e
updated the xml export
artourkin Oct 8, 2017
d0733cb
updated deserializer to support empty sourcedValues
artourkin Oct 24, 2017
d97ceac
added null pointer checks
artourkin Nov 19, 2017
9a29636
fixed configuration, updated c3po to version 0.6
artourkin Apr 27, 2018
f9e4e94
changed ubuntu:latest to ubuntu:16.04
artourkin Apr 29, 2018
889b8a2
fixed loading of default template config
artourkin Apr 29, 2018
90ddea9
cleaned up unit tests
artourkin Apr 29, 2018
65efbe4
Removed features section
artourkin Apr 30, 2018
67e4e90
updated conflict resolution rule.
artourkin Sep 1, 2018
9cf7200
refactored sfd sampling algorithm
artourkin Sep 14, 2019
7622cc3
Add sbt launcher
Apr 16, 2021
bfdb78f
Add sbt launcher
Apr 16, 2021
8cb235e
fix dependencies
May 8, 2021
a50c018
update Dockerfile
May 8, 2021
049f541
Update Dockerfile
May 8, 2021
c1edc90
Update Dockerfile
May 8, 2021
db81dd4
project clean up
May 8, 2021
4f2add9
project clean up
May 8, 2021
0620776
fix typo in dockerfile
May 8, 2021
177d7db
Fix broken export of sfd sampling results.
artourkin May 30, 2021
9dbbe28
Massive improvement of docker build time
artourkin May 30, 2021
ea9edb0
Fix NPEs when mongodb is not available
artourkin May 30, 2021
34c141d
Fix the bug when SFD exports an empty report
artourkin May 30, 2021
d206b27
Add sleep to the script
artourkin Dec 11, 2021
9e9812e
remove apt cache, allows builds on non linux OS
artourkin Dec 12, 2021
151866a
minor improvements
artourkin Dec 12, 2021
a2d1e9b
minor improvements
artourkin Dec 12, 2021
d7ab181
minor improvements
artourkin Dec 12, 2021
8172a44
update dockerfile
Jan 6, 2022
3 changes: 3 additions & 0 deletions .dockerignore
@@ -0,0 +1,3 @@
Dockerfile
.idea/
*/target/*
6 changes: 6 additions & 0 deletions .gitignore
@@ -3,6 +3,7 @@
.settings/
.DS_Store
target/
*/target/*
data/
bin/
c3po-core/output.xml
@@ -12,3 +13,8 @@ c3po-webapi/war/
*/*.csv
*/logs/
*/log
*.iml
.idea/
# vagrant related local settings, should never end up in the repo
.vagrant_settings/maven_settings.xml
.vagrant_settings/proxy_settings.conf
28 changes: 16 additions & 12 deletions .travis.yml
@@ -1,14 +1,18 @@
language: java

branches:
only:
- master
- integration

language: scala
scala:
- 2.10
jdk:
- oraclejdk6
- oraclejdk7
- openjdk6
- openjdk7
- oraclejdk8
cache:
directories:
- "$HOME/.ivy2/cache"
before_cache:
- rm -rf $HOME/.ivy2/cache/com.typesafe.play/*
- rm -rf $HOME/.ivy2/cache/scala_*/sbt_*/com.typesafe.play/*
- find $HOME/.ivy2/cache -name "ivydata-*.properties" -print0 | xargs -n10 -0 rm

services: mongodb
# See https://blog.travis-ci.com/2014-03-13-slack-notifications/
# created with travis encrypt command line tool
notifications:
slack:
secure: 3BJCD1KtQdPw+Q/eg4wN4DYHHXvS8/YIdNA1xdwPREuCe6rn4khhTu3HcREI07rG3wfQhphy+f1bs+A3K2h9SGVoa4tslng7Bg2jFlf50pXFJZhhXcHQxCApxhj93TP54SQxRFtYLSOueJa6YRWnqxJMZpuMnGOU9cY3iuYbIaExTncjAZdkLsZbFaJtGhI2PqgIyEPEGo8CMZ1EQs2EP+vWKAS0rsKYQNHPd2hp7Z1cHzU0w8SNOFmkgy11J/NFe/Of3Bt67PfIMUnxA61hB/Xl5llqCkWmf5shntyAGCo8bxqWHlK+O8+ZU49EODl+kChJklcQ7btPB7vc3AXFpDllegDg4d8dWszwbo7yX3zEjr4iQtv2j5QW6euHMW9LBkBiLBuAg5vEB2ERQDsz888YL2djRtATuvwS77HSznmnmsENYTsOdm+mjm+x2R9k1uZ68+z3qBoLhsVpsxRakJBuwcJO8EE0NUTQOTm/ftgJ0NQpf7TWfByOqDCv/bZeZU/lK71NLKARJbe5bC2QW0oD5LjINITT+nd5UFmkirmy10EVFiNvF+GLKNfSk3pVQ2lUVKrF22qI9d+wR+Oy5aZBNxKK7tySKiT2XNbgNdNzmX66bV2ib+zm4A3Te5dvetNMIFy8kcU34LTe5NIbZt/efGagpo+1ijhLS6gBiKk=
310 changes: 0 additions & 310 deletions CHANGELOG.html

This file was deleted.

65 changes: 65 additions & 0 deletions Dockerfile
@@ -0,0 +1,65 @@
# syntax = docker/dockerfile:1.2

FROM ubuntu:18.04

RUN rm -f /etc/apt/apt.conf.d/docker-clean




RUN apt-get update \
&& apt-get -y install apt-transport-https \
&& apt-get update \
&& apt-get install -yqq --no-install-recommends openjdk-8-jdk screen \
&& rm -rf /var/lib/apt/lists/*

COPY . /c3po
WORKDIR /c3po

RUN ["/bin/bash", "-c", "./sbt clean compile assembly dist"]





FROM ubuntu:18.04


RUN apt-get update \
&& apt-get -y install apt-transport-https gnupg curl \
&& curl -fsSL https://www.mongodb.org/static/pgp/server-3.6.asc | apt-key add - \
&& echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/3.6 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-3.6.list \
&& apt-get update \
&& apt-get install -yqq --no-install-recommends mongodb-org openjdk-8-jdk screen unzip

RUN apt-get clean autoclean \
&& apt-get autoremove --yes \
&& rm -rf /var/lib/apt \
&& rm -rf /var/lib/dpkg \
&& rm -rf /var/cache/apt

RUN rm -f /etc/apt/apt.conf.d/docker-clean
RUN mkdir /c3po
WORKDIR /c3po

COPY --from=0 /c3po/c3po-cmd/target/scala-2.11/c3po-cmd-assembly-0.1-SNAPSHOT.jar ./c3po-cmd.jar
COPY --from=0 /c3po/c3po-webapi/target/universal/c3po-webapi-0.1-SNAPSHOT.zip ./c3po-webapi.zip
RUN unzip c3po-webapi.zip && rm -rf c3po-webapi.zip

EXPOSE 9000

RUN echo "#!/bin/bash \n\
set -e \n\
echo The number of files found in /data/FITS: \n\
find /data/FITS -type f | wc -l \n\
echo 'Now, C3PO will import metadata from FITS files..' \n\
mkdir -p /data/db \n\
nohup mongod --dbpath /data/db & \n\
sleep 10 \n\
java -jar /c3po/c3po-cmd.jar gather -c indocker -i /data/FITS -r \n\
ls \n\
pwd \n\
./c3po-webapi-0.1-SNAPSHOT/bin/c3po-webapi -Dplay.crypto.secret=abcdefghijk \n\
" >> /import.sh

ENTRYPOINT ["bash","/import.sh"]
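
Note on the entrypoint script above: it starts `mongod` in the background and then waits a fixed `sleep 10` before running the `gather` command. A possible alternative, sketched here under assumptions and not part of this change set, is to poll the MongoDB port until it accepts connections. The helper name `wait_for_port` is hypothetical, port 27017 is assumed because the Dockerfile does not override MongoDB's default, and the sketch relies on bash's `/dev/tcp` redirection, which is available because the script is run with `bash`.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): poll a TCP port until it accepts
# connections or a timeout expires, instead of sleeping for a fixed time.
# Arguments: host, port, timeout in seconds (defaults to 30).
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-30}"
  local deadline=$((SECONDS + timeout))
  while (( SECONDS < deadline )); do
    # bash's /dev/tcp pseudo-device opens a TCP connection; running exec in a
    # subshell means the descriptor is closed again when the subshell exits.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Usage sketch inside import.sh, assuming mongod listens on its default port:
#   nohup mongod --dbpath /data/db &
#   wait_for_port 127.0.0.1 27017 60
```

Because the script already sets `set -e`, a non-zero return from the helper would stop the container before the import starts, rather than letting `gather` fail against a database that is not yet up.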
81 changes: 30 additions & 51 deletions README.md
@@ -1,51 +1,30 @@
C3PO
===================================================
[![Build Status](https://travis-ci.org/peshkira/c3po.png?branch=master)](https://travis-ci.org/peshkira/c3po)

Clever, Crafty, Content Profiling of Objects (c3po) is a software tool, which uses meta data extracted from
files of a digital collection as input to generate a profile of the content set. It is designed in a way so
that different meta data formats originating from different tools can be easily integrated. Currently it
supports FITS meta data and Apache TIKA meta data.

The tool follows a three part profiling process and provides facilities for data export and further
analysis of the content, such as helpful visualisations of the meta data characteristics, partitioning
of the collection into homogeneous sets based on any known characteristic. For each chosen partition of
the content, a special machine-readable profile can be generated that contains aggregations and
distributions for many of the properties. The profile optionally contains the set of chosen sample objects
that are representative.

Releases
------------------------
Please refer to [BinTray](http://dl.bintray.com/peshkira/c3po)

Setup
------------------------
Please refer to the [Usage Guide](https://github.com/peshkira/c3po/wiki/Usage-Guide).

Development
------------------------
Please refer to the [Dev Guide](https://github.com/peshkira/c3po/wiki/Development-Guide).

Screenshot
------------------------
![Collection Overview](https://dl.dropbox.com/u/8290338/blog/c3po_overview.png "Collection Overview")

More Information
------------------------
You can find more information in the following links:
- [Website](http://ifs.tuwien.ac.at/imp/c3po)
- [Blog Post](http://www.openplanetsfoundation.org/blogs/2012-11-19-c3po-content-profiling-tool-preservation-analysis)
- [Screencast](https://vimeo.com/53069664)

Road Map
------------------------
* consolidate based on resource name
* bundle optional! FITS execution in c3po (to make it easier for demo purposes)
* create a consistent REST API
* refactor the web app to use the new REST API and the new core
* read data from memory instead of file system and allow adaptors to skip the
memory read
* make use of a controlled vocabulary for properties. If nothing better exists, then use FITS as default.
* implement HBASE backend
* ...
* scale to half a billion objects
# C3PO: Clever, Crafty Content Profiling of Objects

Analyze your content with C3PO.

### How to install and use

Installation and usage instructions are available in our wiki [here](https://github.com/datascience/c3po/wiki).

### Troubleshooting

If you encounter any problems, please let us know by submitting an issue [here](https://github.com/datascience/c3po/issues?state=open).

### Licence

C3PO is released under [Apache version 2.0 license](LICENSE.txt).

### Acknowledgements

Part of this work was supported by the Vienna Science and Technology Fund (WWTF) through project ICT12-046 (BenchmarkDP) and by the 7th Framework Program, IST, through the SCAPE project, Contract 270137.

### Support

This tool is supported by the [Open Planets Foundation](http://www.openplanetsfoundation.org).

### Roadmap

* Conflict resolution
* A controlled vocabulary for properties
* Templating mechanism

31 changes: 0 additions & 31 deletions c3po-api/pom.xml

This file was deleted.
