forked from DMPRoadmap/roadmap
Updates to use ROR v2 format #778
Open
sfisher wants to merge 4 commits into v5 from ror-version-update
Commits (4):
6724ac0 (sfisher): ignore .nvmrc since it's not in the project to tell nvm what version …
5d66bbe (sfisher): Changes to read ROR v2, basic functionality
2f60435 (sfisher): prefer domain from the domains list if available.
0ce24a0 (sfisher): Likely changes to get the tests/mocks/factories to work with ROR 2. …
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -66,7 +66,7 @@ def fetch(force: false) | |
| if old_checksum_val == metadata[:checksum] | ||
| log_message(method: method, message: 'There is no new ROR file to process.') | ||
| else | ||
| download_file = download_file = metadata['key'] | ||
|
Collaborator review comment: oh wow. 🤣 good catch. wonder how long that's been like that
||
| download_file = metadata['key'] | ||
| download_url = metadata.fetch('links', {}).fetch('download', metadata.fetch('links', {})['self']) | ||
| log_message(method: method, message: "New ROR file detected - checksum #{metadata[:checksum]}") | ||
| log_message(method: method, message: "Downloading #{download_file}") | ||
|
|
@@ -174,7 +174,7 @@ def process_ror_file(zip_file:, file:) | |
|
|
||
| log_message( | ||
| method: method, | ||
| message: "Unable to process record for: '#{hash&.fetch('name', 'unknown')}'", | ||
| message: "Unable to process record for: '#{hash.fetch('names', []).first&.fetch('value', 'unknown')}'", | ||
| info: false | ||
| ) | ||
| end | ||
|
|
@@ -204,14 +204,16 @@ def process_ror_record(record:, time:) | |
|
|
||
| registry_org = RegistryOrg.find_or_create_by(ror_id: record['id']) | ||
| registry_org.name = safe_string(value: org_name(item: record)) | ||
| registry_org.acronyms = record['acronyms'] | ||
| registry_org.aliases = record['aliases'] | ||
| registry_org.country = record['country'] | ||
| registry_org.acronyms = extract_names(item: record, type: 'acronym') | ||
| registry_org.aliases = extract_names(item: record, type: 'alias') | ||
| registry_org.country = extract_country(item: record) | ||
| registry_org.types = record['types'] | ||
| registry_org.language = org_language(item: record) | ||
| registry_org.file_timestamp = time.strftime('%Y-%m-%d %H:%M:%S') | ||
| registry_org.fundref_id = fundref_id(item: record) | ||
| registry_org.home_page = safe_string(value: record.fetch('links', []).first) | ||
|
|
||
| website = record.fetch('links', []).find { |l| l['type'] == 'website' } | ||
| registry_org.home_page = safe_string(value: website ? website['value'] : nil) | ||
|
|
||
| # Attempt to find a matching Org record | ||
| registry_org.org_id = check_for_org_association(registry_org: registry_org) | ||
|
|
@@ -250,54 +252,85 @@ def check_for_org_association(registry_org:) | |
| # "Example College (example.edu)" | ||
| # "Example College (Brazil)" | ||
| def org_name(item:) | ||
| return '' unless item.present? && item['name'].present? | ||
| return '' unless item.present? && item['names'].present? | ||
|
|
||
| # Find ror_display name | ||
| name_obj = item['names'].find { |n| n['types']&.include?('ror_display') } | ||
| name = name_obj ? name_obj['value'] : item['names'].first['value'] | ||
|
|
||
| return '' if name.blank? | ||
|
|
||
| country = extract_country(item: item)&.fetch('country_name', '') | ||
|
|
||
| # Try to get the domain from the 'domains' array first | ||
| website = item.fetch('domains', []).first | ||
| # Fallback to extracting it from the website link | ||
| website = org_website(item: item) if website.blank? | ||
|
|
||
| country = item.fetch('country', {}).fetch('country_name', '') | ||
| website = org_website(item: item) | ||
| # If no website or country then just return the name | ||
| return item['name'] unless website.present? || country.present? | ||
| return name unless website.present? || country.present? | ||
|
|
||
| # Otherwise return the contextualized name | ||
| "#{item['name']} (#{website || country})" | ||
| "#{name} (#{website || country})" | ||
| end | ||
|
|
||
| # Extracts the org's ISO639 if available | ||
| def org_language(item:) | ||
| dflt = I18n.default_locale || 'en' | ||
| return dflt if item.blank? | ||
|
|
||
| country = item.fetch('country', {}).fetch('country_code', '') | ||
| labels = case country | ||
| when 'US' | ||
| [{ iso639: 'en' }] | ||
| else | ||
| item.fetch('labels', [{ iso639: dflt }]) | ||
| end | ||
| labels.first&.fetch('iso639', I18n.default_locale) || dflt | ||
| # Try to get language from ror_display name | ||
| name_obj = item.fetch('names', []).find { |n| n['types']&.include?('ror_display') } | ||
| return name_obj['lang'] if name_obj.present? && name_obj['lang'].present? | ||
|
|
||
| dflt | ||
| end | ||
|
|
||
| # Extracts the website domain from the item | ||
| def org_website(item:) | ||
| return nil unless item.present? && item.fetch('links', [])&.any? | ||
| return nil if item['links'].first.blank? | ||
|
|
||
| website_obj = item['links'].find { |l| l['type'] == 'website' } | ||
| return nil unless website_obj.present? && website_obj['value'].present? | ||
|
|
||
| # A website was found, so extract just the domain without the www | ||
| domain_regex = %r{^(?:http://|www\.|https://)([^/]+)} | ||
| website = item['links'].first.scan(domain_regex).last.first | ||
| website.gsub('www.', '') | ||
| website = website_obj['value'].scan(domain_regex).last&.first | ||
| website&.gsub('www.', '') | ||
| end | ||
|
|
||
| # Extracts the FundRef Id if available | ||
| def fundref_id(item:) | ||
| return '' unless item.present? && item['external_ids'].present? | ||
| return '' unless item['external_ids'].fetch('FundRef', {}).any? | ||
|
|
||
| fundref = item['external_ids'].find { |id| id['type'] == 'fundref' } | ||
| return '' unless fundref.present? | ||
|
|
||
| return fundref['preferred'] if fundref['preferred'].present? | ||
|
|
||
| fundref.fetch('all', []).first | ||
| end | ||
|
|
||
| # If a preferred Id was specified then use it | ||
| ret = item['external_ids'].fetch('FundRef', {}).fetch('preferred', '') | ||
| return ret if ret.present? | ||
| # Helper to extract names by type | ||
| def extract_names(item:, type:) | ||
| return [] unless item.present? && item['names'].present? | ||
|
|
||
| item['names'].select { |n| n['types']&.include?(type) }.map { |n| n['value'] } | ||
| end | ||
|
|
||
| # Otherwise take the first one listed | ||
| item['external_ids'].fetch('FundRef', {}).fetch('all', []).first | ||
| # Helper to extract country | ||
| def extract_country(item:) | ||
| return nil unless item.present? && item['locations'].present? | ||
|
|
||
| # Assuming we take the first location | ||
| loc = item['locations'].first | ||
| return nil unless loc.present? && loc['geonames_details'].present? | ||
|
|
||
| details = loc['geonames_details'] | ||
| { | ||
| 'country_name' => details['country_name'], | ||
| 'country_code' => details['country_code'] | ||
| } | ||
| end | ||
| end | ||
| end | ||
|
|
||
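The lookups this diff introduces can be tried outside Rails with a small sketch. It is not the PR's code: the helper names (`names_of_type`, `display_name`, `website_domain`, `country_name`, `fundref_id`, `contextual_name`) and the sample record are hypothetical, the record values are invented, and the domain regex is a simplified stand-in for the one in `org_website`. It assumes the ROR v2 shape the diff reads from: typed `names`, typed `links`, `locations` with `geonames_details`, and `external_ids` as a list.

```ruby
# Hypothetical ROR v2-shaped record; every value here is invented for illustration.
SAMPLE_RECORD = {
  'id' => 'https://ror.org/00example0',
  'names' => [
    { 'value' => 'EC', 'types' => ['acronym'], 'lang' => nil },
    { 'value' => 'Example College', 'types' => %w[ror_display label], 'lang' => 'en' },
    { 'value' => 'Example Coll.', 'types' => ['alias'], 'lang' => nil }
  ],
  'domains' => [],
  'links' => [
    { 'type' => 'wikipedia', 'value' => 'https://en.wikipedia.org/wiki/Example' },
    { 'type' => 'website', 'value' => 'https://www.example.edu/about' }
  ],
  'locations' => [
    { 'geonames_details' => { 'country_name' => 'Brazil', 'country_code' => 'BR' } }
  ],
  'external_ids' => [
    { 'type' => 'fundref', 'all' => %w[100000001 100000002], 'preferred' => nil }
  ]
}.freeze

# Values of all names carrying the given type (e.g. 'acronym', 'alias').
def names_of_type(record, type)
  Array(record['names']).select { |n| Array(n['types']).include?(type) }.map { |n| n['value'] }
end

# The ror_display name, falling back to the first listed name.
def display_name(record)
  names = Array(record['names'])
  chosen = names.find { |n| Array(n['types']).include?('ror_display') } || names.first
  chosen && chosen['value']
end

# Host of the link typed 'website', without the scheme or a leading 'www.'.
# Simplified regex; not the one used in the PR.
def website_domain(record)
  link = Array(record['links']).find { |l| l['type'] == 'website' }
  return nil if link.nil? || link['value'].to_s.empty?

  host = link['value'][%r{\A(?:https?://)?([^/]+)}, 1]
  host&.sub(/\Awww\./, '')
end

# Country name of the first location, or nil when any level is missing.
def country_name(record)
  record.dig('locations', 0, 'geonames_details', 'country_name')
end

# Preferred FundRef id, else the first one listed, else ''.
def fundref_id(record)
  entry = Array(record['external_ids']).find { |id| id['type'] == 'fundref' }
  return '' if entry.nil?

  entry['preferred'] || Array(entry['all']).first || ''
end

# Name plus context: first entry in 'domains', else website domain, else country.
def contextual_name(record)
  name = display_name(record)
  return '' if name.nil? || name.empty?

  context = Array(record['domains']).first || website_domain(record) || country_name(record)
  context ? "#{name} (#{context})" : name
end

puts contextual_name(SAMPLE_RECORD) # prints "Example College (example.edu)"
```

Because `domains` is empty in the sample, the sketch falls back to the website link, which mirrors the "prefer domain from the domains list if available" commit. With no domain and no website link it would fall back to the country name.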
L169: I believe you wouldn't want .nvmrc in your .gitignore file, because you want to make sure that your team uses the same Node version.