Releases: se-sic/coronet
v5.1
Changes in detail
Added
- Add the possibility to split networks that contain simplified edges (PR #278, 9798d33, 0ed437c, 67a6651, 98ef831, 7ec4d83, 637d62a, 2c70666, 1cbc6fa, 41788ff, b042c0d, 36d23d6, 402c256, 54af2b1, 894414a, 0fe32a2)
- Add functionality for commit-message content analysis, such as NLP tools including stemming, tokenization, and lemmatization, as well as a function to search for keywords in commit messages and a function to measure the length of the messages (PR #281, 5aa4e41, 99f0638, e469d3a, 7d8fd39, ef689f7, 6e64224, f544394, dd9246b)
- Add `unlist.timestamps.if.possible` parameter to `get.edgelist.with.timestamps` which allows callers to request a conversion of the timestamps from list to vector if possible, i.e., when there are no simplified edges in the network (PR #289, 39bf1dd)
Changed/Improved
- For consistency reasons: Ensure that the values of edge attributes are always lists even when they represent singular values (PR #278, 6fae184, 416c817)
- Reduce the amount of redundantly built networks by caching network data internally. This should improve the performance of building multi-networks, especially, when parts of the multi-networks have been built before (#119, PR #282, 06a814c, 4793eab, 8ba907f, 28d2290, 3608214, b30c7f2, 1fa340d, 40cd554, 8fcc744, ca348f1, 1d233af, 5dd5fc1)
- Internally cache commit-network data and bipartite-network data similarly to how we cache network data for author-, and artifact-networks (PR #282, PR #285, 3608214, aa7e3fa)
- Ensure that configured or implicitly-enforced undirectedness in partial networks is always dominant over configured or implicitly-enforced directedness. Furthermore, ensure consistency in the directedness used for edge generation and as a network attribute, especially in networks that consist of multiple partial networks such as multi-networks (PR #282, 65ead39, a776caf, 257a1c8, 41cff01)
- Remove redundant entries from the list of allowed edge attributes and instead add `event.info.1` and `event.info.2` (PR #282, 1b156c1)
- Allow the issue data attributes `event.info.1` and `event.info.2` on network edges (PR #282, 1b156c1)
- Add a `network.type` parameter to `get.networks` in which the caller can specify the types of networks to be constructed. This improves performance in cases where not all network types are needed, such as when building multi-networks (PR #285, bc2efd6, e9a0c16)
- Rename the `list.attributes` parameter in `add.vertex.attribute` and `split.and.add.vertex.attribute` to `flatten.values` with inverted semantics and introduce documentation for it to improve comprehensibility (PR #285, 7dab04a)
- Sort author data by `author.name` instead of `author.id` when reading it from a file (PR #286, 61b538b)
- Enhance codeface testing data by ensuring that commit ids are unique between proximity and feature data and by adding commit data that includes (1) different commits that touch the same file / function, (2) commits that are authored at the same time by different authors (PR #286, 7481099, 3e53285)
- Deprecate support for R version 4.0 because of breaking dependencies (PR #281, 3dc91b1)
Fixed
- Fix a bug in `construct.edge.list.from.key.value.list` that could cause a crash when constructing a network where different vertices have identical associated timestamps (PR #285, d694a68)
- Fix a bug in network construction that could lead to edges having an unwanted `author.name.1` attribute (PR #285, 105fec1)
- Ensure that POSIXct values are correctly handled in `add.vertex.attribute`, i.e., that they are not converted to numeric values (PR #285, 7dab04a, 4924ac2)
- Handle empty edges when constructing commit networks using commit-interaction data (PR #285, d5e1e48)
- Correctly retain order of commit and mail data when merging it with PAStA, synchronicity, and commit message data (PR #286, 50b9b68)
- Fix `get.edgelist.with.timestamps` to work correctly on networks with dates in default (list) format (PR #289, 35b34bf)
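The list-to-vector conversion of timestamps mentioned for `get.edgelist.with.timestamps` can be illustrated with a minimal base-R sketch. The helper name `unlist.timestamps.sketch` and the sample data are hypothetical and are not taken from `coronet`; the sketch only assumes that each edge stores its timestamps as one list entry and that a simplified edge holds more than one timestamp.

```r
# Hypothetical helper (not coronet's implementation): collapse a list of
# timestamps into a POSIXct vector if every entry holds exactly one value.
unlist.timestamps.sketch <- function(timestamps) {
    if (all(lengths(timestamps) == 1)) {
        # 'do.call(c, ...)' keeps the POSIXct class, whereas 'unlist' would
        # drop it and return plain numbers
        return(do.call(c, timestamps))
    }
    # entries with several timestamps (simplified edges) stay in list form
    return(timestamps)
}

# one timestamp per edge: conversion to a vector is possible
plain <- list(as.POSIXct("2020-01-01", tz = "UTC"),
              as.POSIXct("2020-01-02", tz = "UTC"))
# a single simplified edge carrying two timestamps: the list is kept
simplified <- list(as.POSIXct(c("2020-01-01", "2020-01-02"), tz = "UTC"))

plain.result <- unlist.timestamps.sketch(plain)
simplified.result <- unlist.timestamps.sketch(simplified)
```

Using `unlist` directly on a list of POSIXct values would strip the date class, which is why the sketch concatenates the entries with `c` instead.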
v5.0
Changes in detail
Announcement
- `coronet` is not compatible with `igraph` versions below 2.1.0 anymore. This is due to the simultaneous deprecation of `subgraph.edges` and the introduction of the replacement for it, `subgraph_from_edges`, in `igraph` version 2.1.0.
- The plotting module of `coronet` is not compatible with `ggplot2` versions below 3.5.0 anymore. This is due to the simultaneous deprecation of the `scale_name` parameter of `discrete_scale` (which is used within the function `plot.network` of `coronet`) in `ggplot2` version 3.5.0.
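For code that calls `igraph` directly, the switch from `subgraph.edges` to `subgraph_from_edges` might look like the following sketch. The example graph and the selected edge ids are made up for illustration, and the version check is only one possible way to guard against older installations.

```r
library(igraph)

# The announcement above names 2.1.0 as the igraph version that introduces
# 'subgraph_from_edges', so stop early on older installations
stopifnot(packageVersion("igraph") >= "2.1.0")

# illustrative graph: an undirected ring with 5 vertices and 5 edges
g <- make_ring(5)

# keep only the first two edges and drop vertices that are no longer
# incident to any retained edge
sub <- subgraph_from_edges(g, eids = c(1, 2), delete.vertices = TRUE)
```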
Added
- Add commit-interaction data and add functions `read.commit.interactions` for reading, as well as `get.commit.interactions`, `set.commit.interactions`, and utility functions for working with commit-interaction data (PR #252, d82857f, b4fd2a2, fd0aa05, bca3576, PR #263, 849123a, 3fb7437, 170bc66, f591528, 4c4b654) as well as tests for these features (PR #252, eeba7e2, 8bb39f4, 54b6f65, 7a5497a, 7b8585f, ef72540)
- Add commit-interaction networks that can be created with `create.author.network` or `create.artifact.network` if the `artifact.relation` or `author.relation` is configured to be `commit.interaction` (PR #252, d82857f, 329d97e) as well as tests for these features (PR #252, 07e7ed7, 7068cfa)
- Add commit network as a new type of network. It uses commits as vertices and connects them either via cochange or commit interactions. This includes adding new configuration parameters and the function `add.vertex.attribute.commit.network` for adding vertex attributes to a commit network (PR #263, ab73271, ab73271, cd9a930)
- Add the possibility to split data time-based by multiple data sources (PR #261, 1088395, e1f79fc, 0bb187f, 371a97a)
- Add `remove.duplicate.edges` function that takes a network as input and conflates identical edges (PR #268, d9a4be4, 0c2f47c, c6e90dd)
- Add `cumulative` as an argument to `construct.ranges` which enables the creation of cumulative ranges from given revisions (PR #268, a135f6b, 8ec207f)
- Add function `get.last.activity.data` to compute developers' last activities in a project, as well as function `add.vertex.attribute.author.last.activity` to add a developer's date of last activity as vertex attribute to a network, as well as helper functions `get.aggregated.activity.data` and `add.vertex.attribute.author.aggregated.activity` to allow for other activity aggregations than first and last activity (PR #275, 9f23161, 8660ed7)
- Add four new metrics that can be used for the classification of authors into core and peripheral: betweenness, closeness, pagerank, and eccentricity (PR #276, e27acb5, 2178808)
- Add helper function for prefixing function names with file names in `util-read.R` (PR #252, f8ea987)
- Add line-based code coverage reports into CI pipeline. Coverage reports are generated by `coverage.R` (PR #262, 10cac49, b3b9f4a, c815d18, e809352, 32d0482)
- Add tests for uncovered functionality in `util-misc.R` and `util-networks.R` (PR #264, ff30f32, af80551)
Changed/Improved
- Breaking Change: Change the default representation of edge attributes from vectors to lists. This change is necessary for the interplay of `coronet` networks with certain `igraph` functionality since igraph version 2.1.0 (PR #274, 1c35d1f, eda30b8, 0c6b2eb, 44c7b72, 7303eab, 0c27012)
- Change the default value for the `issues.from.source` configuration parameter. Instead of reading JIRA and GitHub issues together, which was the previous default, the new default value causes only GitHub issue data to be read. To restore the previous default behavior and read data from both issue sources, this now needs to be manually configured when needed. (PR #264, 5ff83c3, 8c8080c, 8bcbc81)
- Replace deprecated `igraph` functions by their preferred alternatives (PR #264, PR #268, PR #274, PR #279, 0df9d5b, 7ac840d, e3617b8, 4b0d522, f29662b)
- Remove deprecated parameter of `ggplot2::discrete_scale` (PR #279, 027ce79)
- Deprecate support for R version 3.6 (PR #264, c8e6f45, fb3f547)
- Explicitly add R version 4.4 to the CI test pipeline (c8e6f45)
- Refactor function `construct.edge.list.from.key.value.list` to be more readable (PR #263, 05c3bc0)
- Update necessary `igraph` version to 2.1.0 in `README.md` (PR #274, 6c3bcd1)
- Update necessary `ggplot2` version to 3.5.0 in `README.md` (PR #279, 027ce79)
- Include core/peripheral classification in `README.md` (PR #276, 6101e11, c6744c0, 5fc2da5)
Fixed
- Fix the creation of edgelists for issue-based artifact-networks by correctly iterating over the issue data (PR #264, 321d850)
- Add range information to network-splits when splitting a network using `split.network.time.based.by.ranges`. This effect also propagates into `split.networks.time.based` (PR #274, 87911ad)
- Fix a bug in `extract.timestamps` that occurs when the first `data.source` contains empty data and that leads to a return value of type numeric which should be POSIXct (PR #270, 10696e4, 646c01a)
- Adjust `metrics.scale.freeness` and `metrics.is.scale.free` functions to be compatible with both older and newer igraph versions (PR #274, 4b0d522)
v4.4
Changes in detail
Announcement
- Due to a bug in package `igraph` (igraph/rigraph#1158), which is present in their versions 2.0.0 to 2.0.3, the functions `metrics.scale.freeness` and `metrics.is.scale.free` can currently not be used with these `igraph` versions. If you need to call any of these two functions, you either need to install `igraph` version 1.6.0 or wait until the bug in `igraph` is fixed in a future version of `igraph`.
Added
- Add issue-based artifact-networks, in which issues form vertices connected by edges that represent issue references. If possible, disambiguate duplicate JIRA issue references that originate from codeface-extraction (PR #244, PR #249, 98a93ee, 771bcc8, 368e792, fa3167c, 4646d581d5e1f63260692b396a8bd8f51b0da48fda, ed77bd7)
- Add a new `split.data.by.bins` function (not to be confused with a previously existing function that had the same name and was renamed in this context), which splits data based on given activity-based bins (PR #244, ece569c, ed5feb2)
- Add `get.bin.dates.from.ranges` function to convert date ranges into bins format (PR #249, a1842e9, 858b181)
- Add the possibility to simplify edges of multiple-relation networks into a single edge overall instead of a single edge per relation (PR #250, PR #255, 2105ea8, a34b5bd, 3451641, 78f4351, d310fdc, 58d77b0)
- Add network simplification to showcase file (PR #255, dc32d44)
- Add tests for network simplification (PR #255, 338b069, 8a6f47b, e01908c, 7b6848f, 666d784)
- Add an `assert.sparse.matrices.equal` function to compare two sparse matrices for equality for testing purposes (PR #248, 9784cdf, d9f1a8d)
- Add tests for file `util-networks-misc.R` (#242, PR #248, PR #258, f3202a6, 030574b, 380b022, 8b803c5, 7335c3d, 6b600df, a53fab8, faf19fc)
Changed/Improved
- Add input validation for the `bins` parameter in `split.data.time.based` and `split.data.by.bins` (PR #244, ed0a530, 5e5ecba)
- Test for the presence and validity of the `bins` attribute on network-, and data-splits (PR #249, c064aff, 93051ab)
- Simplify call chain-, and branching-routes in network-splitting functions and consequently set the `bins` attribute on every output network-split (while minimizing recalculations) (PR #249, #256, PR #257, a1842e9, 8695fbe)
- Rename `split.data.by.bins` into `split.dataframe.by.bins` as this is what it does (PR #244, ed5feb2)
- Throw an error in `split.data.time.based.by.timestamps` if no custom event timestamps are available in the ProjectData object (6305adc)
- Enhance testing data by adding `add_link` and `referenced_by` issue events, which connect issues to form edges in issue-based artifact-networks. This includes duplicate edge information in JIRA data as produced by codeface-extraction (PR #244, 9f840c0, ea4fe8d, 6eb7311)
- Add a check for empty networks in the functions `metrics.scale.freeness` and `metrics.is.scale.free` and return `NA` if the network is empty (29418f2)
- Enhance `get.author.names.from.network` and `get.author.names.from.data` to always have the same output format. The format no longer depends on the `global` flag (PR #248, d87d325, ddbfe68)
- Change `util-tensor.R` to correctly use the new output format of `get.author.names.from.network` (PR #248, 72b663e)
- Throw an error in `convert.adjacency.matrix.list.to.array` if the function is called with wrong parameters (PR #248, ece2d38, 1a3e510)
- Rename `compare.networks` to `assert.networks.equal` to better match the purpose of the function (PR #248, d9f1a8d)
- Explicitly add R version 4.3 to the CI test pipeline (9f346d5)
Fixed
- Reformat `event.info.1` column of issue data according to the `<issue-%source-%id>` format, if the content of the `event.info.1` field references another issue (PR #244, 62ff9d0)
- Rename vertex attribute `IssueEvent` to `Issue` in multi-networks, to be consistent with bipartite-networks (PR #244, 26d7b7e)
- Fix an issue in activity-based splitting where elements close to the border of bins might be assigned to the wrong bin. The issue was caused by the usage of `split.data.time.based` inside `split.data.activity.based` to split data into the previously derived bins, when elements close to bin borders share the same timestamps. It is fixed by replacing `split.data.time.based` by `split.data.by.bins` (PR #244, ece569c)
- Remove the last range when using a sliding-window approach and the last range's elements are fully contained in the second last range (PR #244, 48ef4fa, 943228f)
- Fix broken error logging in `metrics.smallworldness` (03e0688)
- Fix `get.expanded.adjacency` to work if the provided author list does not contain all authors from the network and add a warning when that happens since it causes some authors from the network to be lost in the resulting matrix (PR #248, ff59017)
- Fix `get.expanded.adjacency.matrices` to have correct names for the columns and rows (PR #248, PR #258, e72eff8, a53fab8)
- Fix `get.expanded.adjacency.cumulated` so that it works if the `weighted` parameter is set to `FALSE` (PR #248, 2fb9a5d)
- Fix multi-network construction to work with `igraph` version 2.0.1.1, which does not allow adding an empty list of vertices (PR #250, 5547896)
v4.3
Changes in detail
Added
- Add function `verify.data.frame.columns` to check that a dataframe includes all required columns, optionally with a specified datatype (PR #231, d1d9a03)
- Add helper function `is.single.na` to check whether an element is of length 1 and is `NA` (ddff2b8, ccfc2d1)
- Add CI support for GitHub Actions (PR #234, fa1fc4a)
Changed/Improved
- Include structural verification to almost all functions that read dataframes from files or set a dataframe (setter-functions) (PR #231, b7a9588)
- Include removal of empty and deleted users in the setters of mails, commits, issues, and authors. For commits, also the `committer.name` column is now checked for deleted or empty users (PR #235, 08fbd3e)
- Check for empty values (i.e., values of length < 1) when updating configuration attributes and throw an error if a value is empty (9f36c54)
Fixed
- Fix check for empty input files in utility read functions. Unlike missing files, empty files do not throw an error when being read; a check for `nrow(commit.data) < 1` is therefore required (PR #231, ecfa643)
- Fix various problems regarding the default classes of edge attributes and vertex attributes, and also make sure that the edge attributes for bipartite edges are chosen correctly (PR #240, 4275b93, 98a6deb, b8232c0, a953555, 820a763)
- Add argument to `construct.edge.list.from.key.value.list` function which differentiates if constructed edges are supposed to be artifact edges, in which case we check if the `artifact` attribute is present for edges and replace it by `author.name`. (PR #238, e2c9d6c, 7f42a91)
- Change edge construction algorithm for cochange-based artifact networks to respect the temporal order of data. This avoids duplicate edges. (PR #238, e2c9d6c)
- Clarify in `README.md` that edges in issue-based artifact-networks are not available yet. (PR #238, 18a54f0)
- Fix bugs related to expanded adjacency matrices and update the initialization of sparse matrices to the most recent version of package Matrix, to replace deprecated and defunct function calls. Due to this update, package versions prior to 1.3.0 of the Matrix package cannot be used any more. If `install.R` detects that a version prior to 1.3.0 is installed, it now automatically tries to re-install package Matrix once (PR #241, 573fab2, 2f06252)
- Prevent R warnings `'length(x) = 2 > 1' in coercion to 'logical(1)'` in `if` conditions for updating configuration values, in update functions of additional data sources, and in `get.first.activity.data()` (PR #237, PR #241, ddff2b8, e1579ca)
- Prevent R warnings `In xtfrm.data.frame(x) : cannot xtfrm data frames` (PR #237, c24aee7)
- Fix wrong bracket in pasted logging message (PR #241, 50c68cb)
- Replace deprecated R function calls (PR #237, ed43382)
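The length-related warnings above and the `is.single.na` helper from this release are related: a check that only inspects `is.na(x)` yields one logical value per element, which is unsuitable as a condition when `x` holds several elements. A minimal sketch of such a helper, assuming only the behavior stated in the changelog (length 1 and `NA`), could look as follows. The function name `is.single.na.sketch` is hypothetical and the body is not copied from `coronet`.

```r
# Hypothetical re-implementation for illustration: TRUE only if 'x' has
# exactly one element and that element is NA.
is.single.na.sketch <- function(x) {
    # '&&' short-circuits, so 'is.na' is evaluated only for length-1 input
    # and the result is always a single logical value
    length(x) == 1 && is.na(x)
}

single.na <- is.single.na.sketch(NA)             # one element, and it is NA
mixed.vector <- is.single.na.sketch(c(NA, 1))    # two elements
empty.value <- is.single.na.sketch(NULL)         # no elements

if (!mixed.vector) {
    message("a vector with two elements is not a single NA")
}
```

Because the length check comes first, the condition never evaluates to a logical vector of length greater than one.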
v4.2
Changes in detail
Added
- Incorporate custom event timestamps, i.e., add a configuration entry to the project configuration that allows specifying a file from which timestamps can be read, as well as an entry that allows locking this data; add corresponding functions `get.custom.event.timestamps`, `set.custom.event.timestamps` and `clear.custom.event.timestamps` (PR #227, 0aa3424, 0f237d0, c180398, 54e089d, 54673f8, c5f5403)
- Add function `split.data.time.based.by.timestamps` to allow using custom event timestamps for splitting. Alternatively, timestamps can be specified manually (PR #227, 5b8515f, 43f23a8)
- Add the following vertex attributes for artifact vertices and corresponding helper functions (PR #229, 2072807, 51b5478, 56ed57a, 9b06036, 52d40ba, e91161c)
  - `add.vertex.attribute.artifact.last.edited`
  - `add.vertex.attribute.mail.thread.contributer.count`, `get.mail.thread.contributor.count`
  - `add.vertex.attribute.mail.thread.message.count`, `get.mail.thread.message.count`
  - `add.vertex.attribute.mail.thread.start.date`, `get.mail.thread.start.date`
  - `add.vertex.attribute.mail.thread.end.date`, `get.mail.thread.end.date`
  - `add.vertex.attribute.mail.thread.originating.mailing.list`, `get.mail.thread.originating.mailing.list`
  - `add.vertex.attribute.issue.contributor.count`, `get.issue.contributor.count`
  - `add.vertex.attribute.issue.event.count`, `get.issue.event.count`
  - `add.vertex.attribute.issue.comment.event.count`, `get.issue.comment.count`
  - `add.vertex.attribute.issue.opened.date`, `get.issue.opened.date`
  - `add.vertex.attribute.issue.closed.date`, `get.issue.closed.date`
  - `add.vertex.attribute.issue.last.activity.date`, `get.issue.last.activity.date`
  - `add.vertex.attribute.issue.title`, `get.issue.title`
  - `add.vertex.attribute.pr.open.merged.or.closed`, `get.pr.open.merged.or.closed`
  - `add.vertex.attribute.issue.is.pull.request`, `get.issue.is.pull.request`
Changed/Improved
- Breaking Change: Rename existing vertex attributes for author vertices to be distinguishable from attributes for artifact vertices. With this change, the first word after `add.vertex.attribute.` now signifies the type of vertex the attribute applies to (PR #229, 75e8514)
  - `add.vertex.attribute.commit.count.author` -> `add.vertex.attribute.author.commit.count`
  - `add.vertex.attribute.commit.count.author.not.committer` -> `add.vertex.attribute.author.commit.count.not.committer`
  - `add.vertex.attribute.commit.count.committer` -> `add.vertex.attribute.author.commit.count.committer`
  - `add.vertex.attribute.commit.count.committer.not.author` -> `add.vertex.attribute.author.commit.count.committer.not.author`
  - `add.vertex.attribute.commit.count.committer.and.author` -> `add.vertex.attribute.author.commit.count.committer.and.author`
  - `add.vertex.attribute.commit.count.committer.or.author` -> `add.vertex.attribute.author.commit.count.committer.or.author`
  - `add.vertex.attribute.artifact.count` -> `add.vertex.attribute.author.artifact.count`
  - `add.vertex.attribute.mail.count` -> `add.vertex.attribute.author.mail.count`
  - `add.vertex.attribute.mail.thread.count` -> `add.vertex.attribute.author.mail.thread.count`
  - `add.vertex.attribute.issue.count` -> `add.vertex.attribute.author.issue.count`
  - `add.vertex.attribute.issues.commented.count` -> `add.vertex.attribute.author.issues.commented.count`
  - `add.vertex.attribute.issue.creation.count` -> `add.vertex.attribute.author.issue.creation.count`
  - `add.vertex.attribute.issue.comment.count` -> `add.vertex.attribute.author.issue.comment.count`
  - `add.vertex.attribute.first.activity` -> `add.vertex.attribute.author.first.activity`
  - `add.vertex.attribute.active.ranges` -> `add.vertex.attribute.author.active.ranges`
- Add parameter `use.unfiltered.data` to `add.vertex.attribute.issue.*`. This allows selecting whether the filtered or unfiltered issue data is used for calculating the attribute (PR #229, b77601d, 922258c)
- Improve handling of issue type in vertex attribute name for `add.vertex.attribute.issue.*`. The default attribute name still adjusts to the issue type, but this no longer happens if the same name is specified manually (PR #229, fe5dc61)
v4.1
Changes in detail
Added
- Incorporate gender data, i.e., add a configuration entry to the project configuration, add function `read.gender` for reading gender data, add functions `get.gender` and `set.gender` and corresponding utility functions to automatically merge gender data to the author data (PR #216, 8868ff4, bfbe4de, 0a23862, a7744b5, 6a50fd1, 413e24c, 39db315, 1e4026d)
Changed/Improved
- Add `mode` parameter to `metrics.vertex.degrees` to allow choosing between indegree, outdegree, and total (#219, ae14eb4)
- Adjust `.drone.yml` CI config to prevent pipeline failures: `R` version `3.3` is not tested any more as some packages are not available any more for this `R` version (ca6b474). Also, a different docker container is used in the CI pipeline as there are problems with the previously used docker instance (937f797)
Fixed
- Fix values in test for the eigenvector centrality as igraph has changed the calculation of this with version 1.2.7. Also add a warning to `install.R` that we recommend version 1.3.0 and document it in the `README.md` (25fb862, 1bcbca9)
- Fix the filtering of the deleted user in `util-read.R` to always be lowercase as the deleted user can appear with different spellings (#214, 1b4072c)
- Add check to `get.first.activity.data` to look for missing activity types. If no activities are in the RangeData, the function will print a warning and return an empty list (PR #220, #217, 5707517, 42a4bef, d6424c0, ca8a1b4, f6553c6)
v4.0
Changes in detail
Announcement
- `coronet` now has a logo and a website: https://se-sic.github.io/coronet (#167, PR #196)
Added
- Add functionality to read and process commit messages in order to merge them to the commit data (see issue #180). Three values are available for the new attribute `commit.messages` in `ProjectConf`: `none`, `title` and `messages` (PR #193, 85b1d05, fdc414a, 43e1894)
- Add functions `cleanup.commit.message.data` and `cleanup.synchronicity.data` to remove commit hashes that are not any more present in the commit data from the commit message data or synchronicity data (PR #193, 98e83b0)
- Add function `metrics.is.smallworld` to the metrics module in order to unify checks for smallworldness (similar to scalefreeness) (PR #195, ce1f812)
- Add function `metrics.vertex.centralities` to metrics module in order to simplify getting a data frame containing author names and their respective centrality values (d3cd528, e7182e7)
- Add function `get.data.sources.from.relations` to `util-networks.R` which extracts the data sources of a network that were used when building it (PR #195, d1e4413)
- Add tests for the `get.data.sources.from.relations` function (PR #195, add0c74)
- Add logo directory containing several logo variants (PR #196, 82f9971, dc4659e, fdc5e67, 752a9b3)
- Add function `preprocess.issue.data`, which implements common issue data filtering operations. (fcf5cee, a566cae, 5ba6feb)
- Add function `get.issues.uncached`, which gets the issues filtered without poisoning or using the cache. (eb919fa)
- Add function `get.issues.unfiltered` to get the unfiltered issues so that these methods follow the naming scheme known from the respective methods for commits (b9dd94c, e05f344)
- Add per-author vertex attributes regarding counting of issues, issue-creations, issue-comments, mails, mail-threads, ... (like mail thread count, issue creation count) (PR #194, issue #188, 9f9150a, 7260d62, 139f70b, eb4f649, 627873c, 1e1406f, 98e11ab, a566cae)
- Add functionality that allows to read any data source at any point in time, even after splitting. In this case, the read data is automatically cut to the corresponding range on the `RangeData` object (PR #201, 7f9394f). Additionally, when changing the configuration parameters concerning additional data sources, the environment of a `ProjectData` object is no longer reset (PR #201, eed45ac)
- Add new configuration parameters `commits.locked`, `mails.locked` and `issues.locked` to `ProjectConf` which, when set to `TRUE`, prevent the respective getters from triggering the read of the data if it is not present yet (PR #201, 3821677)
- Add support for classifying developers on the basis of more count-based classification metrics, including mail-count, mail-thread count, issue-count, issue-comment count, issue-commented-in count, and issue-created count (issue #70, PR #209, d7b2455, 6f737c8)
- Add bot filtering mechanism, which allows removing issues/mails/commits made by bots (838855f, dcce82d)
- Ignore the "deleted user", as well as the author having an empty name "" (1a08140, 24c222a)
Changed/Improved
- Breaking Change: Rename getters for main data sources: Unfiltered data is now acquired using `get.<datasource>.unfiltered`, filtered data is acquired using `get.<datasource>` (edf19cf, e05f344)
- Add check for empty network in `metrics.hub.degree` function. In the case of an empty network, a warning is printed and `NA` is returned (PR #195, 4b164be)
- Adjust the function `ProjectData$get.artifacts`: Rename its single parameter to `data.sources` and change the function so that it can extract the artifacts for multiple data sources at once. The default is still that only artifacts from the commit data are extracted. (PR #195, cf795f2, 70c05ec, 5a46ff4, fd767bb)
- Change the internal representation of empty data from `NULL` to empty data frames and adapt function `get.cached.data.sources()` of `ProjectData` which returns a vector of all data sources that are cached (including additional and filtered data sources) (PR #201, aec898e, e55d088, 24c222a); additionally, introduce new function `is.data.source.cached()` in `util-data.R` that returns a logical vector indicating which of the given data sources are cached (PR #201, b49cc5d, 491e70c, 24c222a)
- Change the threshold calculation for the classification of developers to use a quantile approach when classifying on the basis of network centrality metrics (issue #205, PR #209, PR #210, 5128252, 0d6a3a1)
- Update documentation in `util-network-metrics.R` and `util-conf.R` (PR #195, f929248, de9988c, PR #199, 059b286)
- Splitting no longer loads all (additional) data sources, but only the ones that have already been cached in the `ProjectData` (PR #201, 52a3014, aec898e, de1bbfe)
- Improve the documentation in `util-core-peripheral.R` by adding roxygen skeleton documentation to undocumented functions (issue #70, PR #209, a3d5ca7, 6f737c8)
- Change the `$` notation to the bracket notation in `util-core-peripheral.R` (issue #70, PR #209, 6f737c8)
- Add `.drone.yml` to enable running our CI pipelines on drone.io (PR #191, 1c5804b)
- Not only run test suite in our CI pipeline, but also run the showcase file in our CI pipeline using test data (719a4f0, 3eb31d8)
- Add R version 4.1 to test suite and adjust missing time-zone attributes on `NA` vectors or empty POSIXct vectors which are correctly added as of R version 4.1 (PR #203, 6b7fb36, 98c5671, 09d11ab)
Fixed
- Fix fencing issue timing data so that issue events "happen" after the issue was created. Since only `commit_added` events are affected, that only happens for these. (issue #185, 627873c, 6ff585d)
- Fix the function `reset.environment()` of both the `ProjectData` and `NetworkBuilder` class; they now reset all the data (PR #199, de091a5, fc4c086)
- Adjust the functions `update.commit.message.data()`, `update.pasta.data()`, and `update.synchronicity.data()`: no warning is printed anymore when they are called by the corresponding cleanup function (PR #199, e5c60a5)
- Fix issue where the data path on `RangeData` objects was wrong in special cases. Introduce the (private) flag `built.from.range.data.read` that is set according to how the object has been created (splitting manually or reading codeface ranges) and calculate the data path accordingly (PR #199, cce9527, 917bf64, 169c034). Also add tests for this new behaviour (PR #199, ef5bac6, 3aa8e7d, d454e5a, 66ad127)
- Make splitting no longer modify the original `ProjectConf`, instead create a copy (e82d056)
- Fix and update outdated examples in the showcase file (473c094, 287fbfa, 0a5cce4, PR #207)
- Fix generation of Codeface range directory names from commit hashes (5c90d1c)
- Fix plotting an empty network via `plot.network` (03f986d)
- Fix behavior of `construct.ranges` when only one range has to be constructed and `sliding.window = TRUE` (000314b)
- Add package `reshape2` to the install script as this package is used in module `util-plot-...
v3.7
Changes in detail
Added
- Add a new file `util-tensor.R` containing the class `FourthOrderTensor` to create (author x relation x author x relation) tensors from a list of networks (with each network having a different relation) and its corresponding utility function `get.author.networks.for.multiple.relations` (PR #173, c136b1f, e4ee0dc, 051a5f0)
- Add function `calculate.EDCPTD.centrality` for calculating the EDCPTD centrality for a fourth-order tensor in the above described form (c136b1f, e4ee0dc, 051a5f0)
- Add new file `util-networks-misc.R` which contains miscellaneous functions for processing network data and creating and converting various kinds of adjacency matrices: `get.author.names.from.networks`, `get.author.names.from.data`, `get.expanded.adjacency`, `get.expanded.adjacency.matrices`, `get.expanded.adjacency.matrices.cumulated`, `convert.adjacency.matrix.list.to.array` (051a5f0)
- Add tests for sliding-window functionality and make parameterized tests possible (a3ad0a8, 2ed84ac, PR #184)
- Add function `cleanup.pasta.data` to remove wrong commit hashes and message ids from the PaStA data (1797e03, PR #189)
Changed/Improved
- Adjust the function `get.authors.by.data.source`: Rename its single parameter to `data.sources` and change the function so that it can extract the authors for multiple data sources at once. The default value of the parameter is a vector containing all the available data sources (commits, mails, issues) (051a5f0)
- Adjust recommended R version to 3.6.3 in README (92be262)
- Add R version 4.0 to test suite and adjust package installation in `install.R` to improve compatibility with Travis CI (40aa0d8, 1ba0367, #161)
Fixed
- Fix sliding-window creation in various splitting functions (`split.network.time.based`, `split.networks.time.based`, `split.data.time.based`, `split.data.activity.based`, `split.network.activity.based`) and also fix the computation of overlapping ranges in the function `construct.overlapping.ranges` to make sure that the last and the second-last range do not cover the same range (1abc1b8, c34c42a, 097cebc, 9a1b651, 0fc179e, cad28bf, 7602af2, PR #184)
- Fix off-by-1 error in the function `get.data.cut.to.same.date` (f0744c0)
- Fix missing or wrongly set layout when plotting networks (#186, 720cc7b, 877931b)
- Fix reading of the PaStA data since the file format has changed (712bbaf, PR #189)
- Fix bug that duplicates revision set ids in the mail and commit data when merging the PaStA data and also copy-paste error when merging PaStA data to commit data (1797e03, PR #189)
- Fix bug that results in an error when there is a variable called 'c' in the R environment (de42eb2, PR #189)
- Fix bug where applying `filter.patchstack.mails()` to an environment with no mail data sets the mail data to `NULL` (8261475, PR #189)
v3.6
Changes in detail
Added
- Add a parameter `editor.definition` to the function `add.vertex.attribute.artifact.editor.count` which can be used to define whether author, committer, or both count as editors when computing the attribute values. (#92, ff1e147)
- Add the possibility to filter out patchstack mails from the mails of the `ProjectData`. The option can be toggled using the newly added configuration option `mails.filter.patchstack.mails`. (1608e28, a932c8c)
- Add a new file `util-plot-evaluation.R` containing functions to plot commit edit types per author and project. (PR #171, d4af515, aa542a2, 0a0a590)
Changed/Improved
- Add R version 3.6 to test suite (8b2a52d, #161)
- Update `.travis.yml` to improve compatibility with Travis CI (41ce589)
Fixed
- Ensure sorting of commit-count and LOC-count data.frames to fix tests with R 3.3 (33d63fd)
v3.5
3.5
Announcement
- Rename project to `coronet` (#10, 929f8ce, ac1ce80)
  - Be sure to update Git remotes and submodules to the new URL!
Changes in detail
Added
- Add the constants `UNTRACKED.FILE`, `UNTRACKED.FILE.EMPTY.ARTIFACT`, and `UNTRACKED.FILE.EMPTY.ARTIFACT.TYPE`: Commits that do not change any artifact are considered to be carried out on a meta-file called `<untracked.file>`. The constant `UNTRACKED.FILE` is added to hold the string constant. Analogously, the constants `UNTRACKED.FILE.EMPTY.ARTIFACT` (currently, `""`) and `UNTRACKED.FILE.EMPTY.ARTIFACT.TYPE` (currently, `""`) hold the constants for any artifacts and their corresponding types, respectively, "changed" in untracked files. (11428d9, 5ea65b9, dde0dd7, 2284bbe)
- Add the public method `ProjectData$get.commits.filtered.uncached`: The method allows for external filtering of the commits by specifying whether untracked files and/or the base artifact should be filtered (this method does not take advantage of caching, whereas the method `ProjectData$get.commits.filtered` does) (11428d9)
- Add the parameters `commits.filter.base.artifact` and `commits.filter.untracked.files` to the `ProjectConf`: In addition to the `ProjectConf` parameter `commits.filter.base.artifact` (previously called `artifact.filter.base`), which configures whether the base artifact should be included in the `get.commits.filtered` method, there is now a similar parameter called `commits.filter.untracked.files` doing the same thing for untracked files (11428d9, 466d8eb)
- Add parameter `edges.for.base.artifacts` to `NetworkConf`: In author networks, edges are no longer constructed between authors for solely modifying untracked files. For authors involved in changing the base artifact, it can be configured whether edges should be created or not using the new `NetworkConf` parameter `edges.for.base.artifacts` (c60c2f6, 466d8eb)
- Add method `ProjectData$get.authors.by.data.source` to retrieve authors by given data-source name (#149, 6580427, 137d833)
- Add helper function `create.empty.data.frame`: The function returns empty data.frames (0 rows) with correct columns and, if specified, all the correct data types. In the future, functions that return data in data.frames should always return data.frames of the same shape (regarding columns and data types), especially when they are empty, because this makes later case distinctions easier or unnecessary (67a4fbe, 3513647)
- For the most common types of data.frames (data.frames of commits, mails, issues, and authors), additional utility functions are added, namely `create.empty.authors.list`, `create.empty.commits.list`, `create.empty.issues.list`, `create.empty.mails.list`, `create.empty.synchronicity.list`, `create.empty.pasta.list`, as well as corresponding constants holding columns and associated data types for all these empty data.frames (5f0f529, 523daef, f8e021d, 3513647, 2f4e6f0, cd3e34a)
- Add mandatory attributes in `create.empty.network` if wanted (cae9d4b, cc8bd86)
- Add function `create.empty.vertex.list` (c00101d)
- Add tests for construction of networks without data (a4b3524)
- Add tests for construction of networks without vertices (6eb214c)
- Add a note on mailing-list threads to README (c6dca27)
- Add cutting functionality to README descriptions (fb40c50)
- Add the parameter `restrict.classification.to.authors` to the functions `get.author.class.by.type`, `get.author.class.overview`, `get.author.class.network.degree`, `get.author.class.network.eigen`, `get.author.class.network.hierarchy`, `get.author.class.commit.count`, and `get.author.class.loc.count`. The parameter allows performing classifications on a limited group of authors whose names are specified in this parameter. (2492dd0, #148)
- Add test cases for `util-core-peripheral.R` by adding the new file `test-core-peripheral.R` along with test cases (2627d6c)
- Add project-configuration parameter `issues.from.source` to choose whether only issues from JIRA, only issues from GitHub, or all issues shall be read in (PR #159, d677949, a3e7132, ea26181). Accordingly, two test cases, one that reads in only JIRA issues and one that reads in only GitHub issues, are added to the issue read test (65b1acd, 2d897cb)
- Add class documentation (#157, 6e33d0a, 250f9e0)
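
The rationale given above for `create.empty.data.frame` (empty results should have the same columns and data types as non-empty ones) can be illustrated with a small base-R sketch. This is an assumption-laden illustration, not coronet's implementation: the helper name `sketch.empty.data.frame`, its signature, and the example column names are hypothetical.

```r
## Hypothetical helper, NOT coronet's create.empty.data.frame:
## return a 0-row data.frame with the given column names and data types.
sketch.empty.data.frame <- function(columns, data.types) {
    stopifnot(length(columns) == length(data.types))
    ## one zero-length vector per column, e.g., character(0) or integer(0)
    cols <- lapply(data.types, function(type) vector(mode = type, length = 0))
    names(cols) <- columns
    return(as.data.frame(cols, stringsAsFactors = FALSE))
}

## example columns are made up for illustration
empty <- sketch.empty.data.frame(
    columns = c("author.name", "commit.count"),
    data.types = c("character", "integer")
)

## an empty result can be combined with real data without special-casing
full <- data.frame(author.name = "Alice", commit.count = 3L, stringsAsFactors = FALSE)
combined <- rbind(empty, full)
str(combined)
```

Because the empty data.frame already carries the expected columns, downstream code can access `empty$commit.count` or call `rbind` without first checking whether any rows exist.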
Changed/Improved
- Always add mandatory vertex and edge attributes (#154, 0526755)
- Heavily improve addition of PaStA data (cd3e34a)
- The method `read.issues` in `util-read.R` now supports the new issue data format (PR #147, 77c750c, e04ce30, 67b818a, 4020487, 3513647). Therefore, the test issue data and all related tests are updated (39971ee, 0ec6c6c, 6a9f4ad, fda000f, 3513647)
- Rename `ProjectConf` parameter `artifact.filter.base` to `commits.filter.base.artifact` (PR #149, 466d8eb)
- The constant `BASE.ARTIFACTS` is extended by adding untracked files (i.e., the new meta-file `UNTRACKED.FILE`), which is now considered to be a new base artifact in the case of file-level analyses. This implies that, in file-level analyses, the base artifact and the untracked files coincide, while in feature-level and function-level analyses they are treated differently (d11d0fb)
- Filtering by artifact kind (e.g., filtering out either `"Feature"` or `"FeatureExpression"`) is now done in the method `ProjectData$get.commits` instead of the method `ProjectData$get.commits.filtered` (894c9a5)
- Remove `get.commits.filtered.empty` and the corresponding `filter.commits.empty` method; the functionality is now included in the methods `get.commits.filtered` and `filter.commits`, respectively (11428d9)
- The private method `ProjectData$filter.commits` now takes parameters which configure whether untracked files and/or the base artifact are to be filtered (11428d9)
- Remove `get.commits.raw`, `set.commits.raw`, and `read.commits.raw` functions (64a9486, c26e582)
- Add commits on untracked files to test suite (#153, d9f527c)
- In the class `Conf` (and its sub-classes `NetworkConf` and `ProjectConf`), default parameters are not validated anymore to avoid confusion by logging output (ec8c6dd)
- In the class `Conf` (and its sub-classes `NetworkConf` and `ProjectConf`), `stop` is called on errors during parameter updates now (ec8c6dd)
- Change shape of `Vertices` in the legend of plots to avoid confusion (f4fb480)
- Refactor `ProjectData$get.cached.data.sources` to be more concise (a4e7a21)
- Update contribution guide regarding `roxygen2` conventions (#157, fbc2d54, 783ee58, 6e33d0a)
- Update README regarding mandatory edge attributes (641624b)
- Rename misleading parameter names for functions `get.author.class.by.type`, `get.author.class.overview`, `get.author.class.network.degree`, `get.author.class.network.eigen`, `get.author.class.network.hierarchy`, `get.author.class.commit.count`, and `get.author.class.loc.count`. Most importantly, the parameter `range.data` was renamed to `proj.data` for these functions. (587ef99, 81568b1, #70)
- Remove the unused functions `get.commit.count.threshold` and `get.loc.count.threshold`. (2534d73, #70)
- The function `verify.argument.for.parameter` was adjusted to be suitable in more general use-cases (557bdcd)
- Do not redundantly initialize data sources when splitting (35698a1)
- Read PaStA and synchronicity data only if enabled (79bf3ca)
- Add and enforce coding convention to use 'vertices' and not 'nodes'. Most importantly, the function `metrics.node.degrees` is renamed to `metrics.vertex.degrees`. (d35ce61...