Publications: Juliana Freire

  • Affiliation: New York University
  • Google Scholar ID: sSzAlq0AAAAJ
  • Total Publications: 323

Title Year Citations Score
Reproducibility and Replicability in Science
https://www.nap.edu/catalog/25303/reproducibility-and-replicability-in-science, 2019
2019 945 99.2%
The open provenance model core specification (v1.1)
Future generation computer systems 27 (6), 743-756, 2011
2011 1073 98.2%
Provenance and scientific workflows: challenges and opportunities
Proceedings of the 2008 ACM SIGMOD international conference on Management of …, 2008
2008 981 97.7%
Visual exploration of big spatio-temporal urban data: A study of new york city taxi trips
IEEE transactions on visualization and computer graphics 19 (12), 2149-2158, 2013
2013 709 97.5%
Provenance for computational tasks: A survey
Computing in science & engineering 10 (3), 11-21, 2008
2008 755 96.7%
The ALPS project release 2.0: open source software for strongly correlated systems
Journal of Statistical Mechanics: Theory and Experiment 2011 (05), P05001, 2011
2011 669 96.7%
VisTrails: visualization meets data management
Proceedings of the 2006 ACM SIGMOD international conference on Management of …, 2006
2006 749 96.6%
A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks
2019 IEEE/ACM 16th International Conference on Mining Software Repositories …, 2019
2019 299 96.3%
Method and apparatus for web-site-independent personalization from multiple sites having user-determined extraction functionality
US Patent 6,976,210, 2005
2005 538 94.6%
Vistrails: Enabling interactive multiple-view visualizations
VIS 05. IEEE Visualization, 2005., 135-142, 2005
2005 527 94.4%
A large-scale study about quality and reproducibility of jupyter notebooks
2019 IEEE/ACM 16th international conference on mining software repositories …, 2019
2019 188 93.4%
From XML schema to relations: A cost-based approach to XML storage
Proceedings 18th International Conference on Data Engineering, 64-75, 2002
2002 519 93.2%
From XML schema to relations: A cost-based approach to XML storage
Proceedings 18th International Conference on Data Engineering, 64-75, 2002
2002 517 93.2%
Auctus: A dataset search engine for data augmentation
arXiv preprint arXiv:2102.05716, 2021
2021 98 92.4%
Reprozip: Using provenance to support computational reproducibility
Presented as part of the 5th USENIX Workshop on the Theory and Practice of …, 2013
2013 281 91.9%
Understanding and improving the quality and reproducibility of Jupyter notebooks
Empirical Software Engineering 26 (4), 65, 2021
2021 89 91.5%
Managing rapidly-evolving scientific workflows
Provenance and Annotation of Data: International Provenance and Annotation …, 2006
2006 362 91.4%
The Seattle report on database research
ACM Sigmod Record 48 (4), 44-53, 2020
2020 117 90.7%
A topic-agnostic approach for identifying fake news pages
Companion proceedings of the 2019 World Wide Web conference, 975-980, 2019
2019 138 90.6%
The first provenance challenge
Concurrency and computation: practice and experience 20 (5), 409-418, 2008
2008 307 90.4%
The open provenance model: An overview
International provenance and annotation workshop, 323-326, 2008
2008 305 90.4%
The open provenance model: An overview
International provenance and annotation workshop, 323-326, 2008
2008 297 90.0%
The seattle report on database research
Communications of the ACM 65 (8), 72-79, 2022
2022 44 88.9%
Predicting taxi demand at high spatial resolution: Approaching the limit of predictability
2016 IEEE international conference on Big data (big data), 833-842, 2016
2016 172 88.9%
An adaptive crawler for locating hidden-web entry points
Proceedings of the 16th international conference on World Wide Web, 441-450, 2007
2007 266 88.2%
ArcheType: A Novel Framework for Open-Source Column Type Annotation Using Large Language Models
Proceedings of the VLDB Endowment 17 (9), 2279-2292, 2024
2024 17 87.5%
Siphoning hidden-web data through keyword-based interfaces
Journal of Information and Data Management 1 (1), 133, 2010
2010 234 87.4%
AlphaD3M: Machine learning pipeline synthesis
AutoML Workshop at ICML, 2018
2018 127 86.9%
Analogy based workflow identification
US Patent 8,060,391, 2011
2011 203 86.5%
VeriWeb: Automatically testing dynamic web sites
In Proceedings of 11th International World Wide Web Conference (WWW’2002), 2002
2002 268 86.3%
noWorkflow: Capturing and Analyzing Provenance of Scripts
Provenance and Annotation of Data and Processes: 5th International …, 2014
2014 164 86.2%
Correlation sketches for approximate join-correlation queries
Proceedings of the 2021 International Conference on Management of Data, 1531 …, 2021
2021 57 85.7%
YesWorkflow: a user-oriented, language-independent tool for recovering workflow information from scripts
arXiv preprint arXiv:1502.02403, 2015
2015 149 85.5%
Method and apparatus for creating and providing personalized access to web content and services from terminals having diverse capabilities
US Patent App. 09/943,133, 2002
2002 248 85.3%
A sketch-based index for correlated dataset search
2022 IEEE 38th International Conference on Data Engineering (ICDE), 2928-2941, 2022
2022 33 84.6%
Provenance for visualizations: Reproducibility and beyond
Computing in Science & Engineering 9 (5), 82-89, 2007
2007 205 84.5%
Provenance for visualizations: Reproducibility and beyond
Computing in Science & Engineering 9 (5), 82-89, 2007
2007 201 84.2%
XSB: A System for Efficiently Computing
Logic Programming and Non-monotonic Reasoning: Proceedings of the …, 1997
1997 207 84.0%
XSB: A system for efficiently computing well-founded semantics
Logic Programming And Nonmonotonic Reasoning: 4th International Conference …, 1997
1997 207 84.0%
PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines
IEEE Transactions on Visualization and Computer Graphics, 2020
2020 70 83.6%
Searching for Hidden-Web Databases.
WebDB, 1-6, 2005
2005 205 83.5%
Automatic exploration and testing of dynamic Web sites
US Patent 7,716,322, 2010
2010 171 82.5%
Anonymizing nyc taxi data: Does it matter?
2016 IEEE international conference on data science and advanced analytics …, 2016
2016 111 82.2%
Managing the evolution of dataflows with vistrails
22nd International Conference on Data Engineering Workshops (ICDEW'06), 71-71, 2006
2006 176 81.8%
Managing the evolution of dataflows with vistrails
22nd International Conference on Data Engineering Workshops (ICDEW'06), 71-71, 2006
2006 172 81.4%
StatiX: making XML count
Proceedings of the 2002 ACM SIGMOD international conference on Management of …, 2002
2002 194 81.4%
Tackling the provenance challenge one layer at a time
Concurrency and Computation: Practice and Experience 20 (5), 473-483, 2008
2008 162 81.2%
Computational reproducibility: state-of-the-art, challenges, and database research opportunities
SIGMOD, 593-596, 2012
2012 135 80.6%
Querying and creating visualizations by analogy
IEEE transactions on Visualization and Computer Graphics 13 (6), 1560-1567, 2007
2007 161 80.5%
Automating Web navigation with the WebVCR
Computer Networks 33 (1-6), 503-517, 2000
2000 187 80.4%
Querying and re-using workflows with VisTrails
Proceedings of the 2008 ACM SIGMOD international conference on Management of …, 2008
2008 146 79.3%
A fast and robust method for web page template detection and removal
Proceedings of the 15th ACM international conference on Information and …, 2006
2006 153 79.2%
WebViews: accessing personalized web content and services
Proceedings of the 10th international conference on World Wide Web, 576-586, 2001
2001 167 78.7%
Viscomplete: Automating suggestions for visualization pipelines
IEEE Transactions on Visualization and Computer Graphics 14 (6), 1691-1698, 2008
2008 141 78.7%
The open provenance model
University of Southampton, 2007
2007 143 78.4%
Data-driven domain discovery for structured datasets
Proceedings of the VLDB Endowment 13 (7), 953-967, 2020
2020 50 77.1%
Scientific process automation and workflow management
Scientific Data Management: Challenges, Technology, and Deployment, 2009
2009 127 77.0%
Scientific Process Automation and Workflow Management.
Scientific Data Management 10 (3), 476-508, 2009
2009 127 77.0%
Stars: Simulating taxi ride sharing at scale
IEEE Transactions on Big Data 3 (3), 349-361, 2016
2016 86 77.0%
Structured open urban data: understanding the landscape
Big data 2 (3), 144-154, 2014
2014 99 76.9%
Data polygamy: The many-many relationships among urban spatio-temporal data sets
Proceedings of the 2016 International Conference on Management of Data, 1011 …, 2016
2016 83 76.2%
Using topological analysis to support event-guided exploration in urban data
IEEE transactions on visualization and computer graphics 20 (12), 2634-2643, 2014
2014 94 75.8%
Interactive data visualization in jupyter notebooks
Computing in Science & Engineering 23 (2), 99-106, 2021
2021 33 75.1%
noWorkflow: a tool for collecting, analyzing, and managing provenance from python scripts
Proceedings of the VLDB Endowment 10 (12), 2017
2017 71 74.6%
A survey on collecting, managing, and analyzing provenance from scripts
ACM Computing Surveys (CSUR) 52 (3), 1-38, 2019
2019 53 74.5%
Combining classifiers to identify online databases
Proceedings of the 16th international conference on World Wide Web, 431-440, 2007
2007 117 74.3%
GPU rasterization for real-time spatial aggregation over arbitrary polygons
Proceedings of the VLDB Endowment 11 (3), 2017
2017 70 74.2%
A layered architecture for querying dynamic web content
Proceedings of the 1999 ACM SIGMOD international conference on Management of …, 1999
1999 124 74.2%
Method for creating and playing back a smart bookmark that automatically retrieves a requested Web page through a plurality of intermediate Web pages
US Patent 6,535,912, 2003
2003 123 73.4%
A comprehensive solution to the XML-to-relational mapping problem
Proceedings of the 6th annual ACM international workshop on Web information …, 2004
2004 120 73.0%
Reproducibility of data-oriented experiments in e-science (dagstuhl seminar 16041)
Dagstuhl Reports 6 (1), 108-159, 2016
2016 72 73.0%
The XSB Programmers' Manual
2003 120 72.9%
Beyond depth-first: Improving tabled logic programs through alternative scheduling strategies
Programming Languages: Implementations, Logics, and Programs: 8th …, 1996
1996 105 72.8%
Making computations and publications reproducible with VisTrails
Computing in Science & Engineering 14 (4), 18-25, 2012
2012 90 72.1%
Automatic machine learning by pipeline synthesis using model-based reinforcement learning and a grammar
arXiv preprint arXiv:1905.10345, 2019
2019 47 71.7%
A scalable approach for data-driven taxi ride-sharing simulation
2015 IEEE International Conference on Big Data (Big Data), 888-897, 2015
2015 74 71.6%
Visus: An Interactive System for Automatic Machine Learning Model Building and Curation
ACM SIGMOD Workshop on Human-In-the-Loop Data Analytics (HILDA), 2019
2019 46 71.1%
Birdvis: Visualizing and understanding bird populations
IEEE transactions on visualization and computer graphics 17 (12), 2374-2383, 2011
2011 89 70.8%
Learning to extract form labels
Proceedings of the VLDB Endowment 1 (1), 684-694, 2008
2008 95 70.3%
Crowdlabs: Social analysis and visualization for the sciences
Scientific and Statistical Database Management: 23rd International …, 2011
2011 84 69.4%
Organizing hidden-web databases by clustering visible web documents
2007 IEEE 23rd International Conference on Data Engineering, 326-335, 2006
2006 92 69.0%
Data quality: The role of empiricism
ACM SIGMOD Record 46 (4), 35-43, 2018
2018 50 68.8%
Your notebook is not crumby enough, REPLace it
Conference on Innovative Data Systems Research (CIDR), 2020
2020 34 68.0%
Proactive Discovery of Fake News Domains from Real-Time Social Media Feeds
Companion Proceedings of the Web Conference, 584-592, 2020
2020 34 68.0%
Dataprism: Exposing disconnect between data and systems
Proceedings of the 2022 International Conference on Management of Data, 217-231, 2022
2022 15 67.4%
A GPU-Based Index to Support Interactive Spatio-Temporal Queries over Historical Data
Proceedings of IEEE International Conference on Data Engineering (ICDE …, 2016
2016 56 66.8%
Applications of executable shopping lists
US Patent 7,103,566, 2006
2006 83 66.6%
A provenance-based infrastructure to support the life cycle of executable papers
Procedia Computer Science 4, 648-657, 2011
2011 73 66.1%
ShreX: Managing XML documents in relational databases
Proceedings of the Thirtieth international conference on Very large data …, 2004
2004 83 65.3%
Debugging machine learning pipelines
Proceedings of the 3rd International workshop on data management for end-to …, 2019
2019 36 64.6%
Visually exploring transportation schedules
IEEE transactions on visualization and computer graphics 22 (1), 170-179, 2015
2015 56 64.6%
Capturing both types and constraints in data integration
Proceedings of the 2003 ACM SIGMOD international conference on Management of …, 2003
2003 80 64.5%
Parallel visualization on large clusters using MapReduce
2011 IEEE Symposium on Large Data Analysis and Visualization, 81-88, 2011
2011 67 64.0%
Cost-based storage of extensible markup language (XML) data
US Patent App. 10/342,551, 2004
2004 78 63.9%
A gpu-friendly geometric data model and algebra for spatial queries
Proceedings of the 2020 ACM SIGMOD international conference on management of …, 2020
2020 29 63.7%
Exploring reproducibility in visualization
IEEE Computer Graphics and Applications 40 (5), 108-119, 2020
2020 29 63.7%
The open provenance model (v1.01)
Technical Report 16148, University of Southampton, Intelligence, Agents …, 2008
2008 70 63.1%
Using vistrails and provenance for teaching scientific visualization
Computer Graphics Forum 30 (1), 75-84, 2011
2011 63 62.5%
Using vistrails and provenance for teaching scientific visualization
Computer Graphics Forum 30 (1), 75-84, 2011
2011 62 62.1%
Designing information-preserving mapping schemes for XML
city 9, 10, 2005
2005 68 61.9%
Weighted minwise hashing beats linear sketching for inner product estimation
Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of …, 2023
2023 7 61.8%
Multilingual schema matching for Wikipedia infoboxes
Proceedings of the VLDB Endowment 5 (2), 133-144, 2011
2011 60 61.2%
Vismashup: Streamlining the creation of custom visualization applications
IEEE Transactions on Visualization and Computer Graphics 15 (6), 1539-1546, 2009
2009 62 61.1%
Prov viewer: A graph-based visualization tool for interactive exploration of provenance data
Provenance and Annotation of Data and Processes: 6th International …, 2016
2016 45 61.0%
Interactive visual exploration of spatio-temporal urban data sets using urbane
Proceedings of the 2018 International Conference on Management of Data, 1693 …, 2018
2018 37 60.9%
Automl using metadata language embeddings
arXiv preprint arXiv:1910.03698, 2019
2019 31 60.5%
Exploring a ‘Deep Web’ that Google can’t grasp
New York Times 23, B1, 2009
2009 60 60.3%
Bridging workflow and data provenance using strong links
Scientific and Statistical Database Management: 22nd International …, 2010
2010 58 59.8%
Sampling Methods for Inner Product Sketching
Proceedings of the VLDB Endowment 17 (9), 2185-2197, 2024
2024 5 59.3%
Method and system for clustering identified forms
US Patent 7,996,390, 2011
2011 55 59.2%
Data Debugging and Exploration with Vizier
ACM SIGMOD, 2019
2019 29 58.6%
Semantica: Version 1.0 (for NEXTSTEP)
MIT Press, 1997
1997 55 58.5%
Supporting exploratory queries in databases
Database Systems for Advanced Applications: 9th International Conference …, 2004
2004 59 58.1%
Reproducibility using vistrails
Implementing Reproducible Research 33, 2014
2014 46 57.8%
Scientific exploration in the era of ocean observatories
Comput. Sci. Eng. 10 (3), 53-58, 2008
2008 55 57.6%
LegoDB: Customizing relational storage for XML documents
VLDB'02: Proceedings of the 28th International Conference on Very Large …, 2002
2002 57 57.6%
Collecting and Analyzing Provenance on Interactive Notebooks: When IPython Meets noWorkflow
7th USENIX workshop on the theory and practice of provenance (TaPP 15), 2015
2015 43 57.6%
A first study on clustering collections of workflow graphs
Provenance and Annotation of Data and Processes: Second International …, 2008
2008 54 57.2%
Exploring Traffic Dynamics in Urban Environments Using Vector‐Valued Functions
Computer Graphics Forum 34 (3), 161-170, 2015
2015 42 56.9%
Searching for efficient XML-to-relational mappings
Database and XML Technologies: First International XML Database Symposium …, 2003
2003 54 56.3%
Bugdoc: Algorithms to debug computational processes
Proceedings of the 2020 ACM SIGMOD International Conference on Management of …, 2020
2020 22 56.1%
Personalizing the Web using site descriptions
Proceedings. Tenth International Workshop on Database and Expert Systems …, 1999
1999 50 56.1%
Synthesizing products for online catalogs
Proceedings of the VLDB Endowment 4 (7), 409-418, 2011
2011 46 54.8%
Spatio-temporal urban data analysis: A visual analytics perspective
IEEE computer graphics and applications 38 (5), 26-35, 2018
2018 29 54.3%
Product synthesis from multiple sources
US Patent 8,352,473, 2013
2013 40 53.4%
Provenance and the different flavors of computational reproducibility
IEEE Data Engineering Bulletin 41 (1), 15, 2018
2018 28 53.4%
Time Lattice: A Data Structure for the Interactive Visual Analysis of Large Time Series
2018 28 53.4%
A unified index for spatio-temporal keyword queries
Proceedings of the 25th ACM international on conference on information and …, 2016
2016 34 53.3%
Enabling advanced visualization tools in a web-based simulation monitoring system
2009 Fifth IEEE International Conference on e-Science, 358-365, 2009
2009 43 52.7%
Managing provenance for an evolutionary workflow process in a collaborative environment
US Patent App. 11/697,926, 2008
2008 44 52.7%
Taking I/O Seriously: Resolution Reconsidered for Disk.
ICLP, 198-212, 1997
1997 40 52.4%
Spade: Gpu-powered spatial database engine for commodity hardware
2022 IEEE 38th International Conference on Data Engineering (ICDE), 2669-2681, 2022
2022 9 52.3%
Personalizing the web using site descriptions
Proceedings. Tenth International Workshop on Database and Expert Systems …, 1999
1999 40 51.7%
Using provenance to support real-time collaborative design of workflows
Provenance and Annotation of Data and Processes: Second International …, 2008
2008 42 51.7%
ARIES: enabling visual exploration and organization of art image collections
IEEE computer graphics and applications 38 (1), 91-108, 2017
2017 28 50.4%
Finding seeds to bootstrap focused crawlers
World Wide Web 19, 449-474, 2016
2016 30 50.0%
End-to-end escience: Integrating workflow, query, visualization, and provenance at an ocean observatory
2008 IEEE Fourth International Conference on eScience, 127-134, 2008
2008 38 49.7%
End-to-end escience: Integrating workflow, query, visualization, and provenance at an ocean observatory
2008 IEEE Fourth International Conference on eScience, 127-134, 2008
2008 37 49.2%
Bugdoc: A system for debugging computational pipelines
Proceedings of the 2020 ACM SIGMOD International Conference on Management of …, 2020
2020 17 48.7%
Repeatability and workability evaluation of SIGMOD 2011
ACM SIGMOD Record 40 (2), 45-48, 2011
2011 35 48.6%
Using mediation to achieve provenance interoperability
2009 Congress on Services-I, 291-298, 2009
2009 32 46.7%
Exploiting parallelism in tabled evaluations
Programming Languages: Implementations, Logics and Programs: 7th …, 1995
1995 27 46.5%
Integrated Scientific Workflow Management for the Emulab Network Testbed.
USENIX Annual Technical Conference, General Track, 363-368, 2006
2006 31 45.7%
On finding templates on web collections
World Wide Web 12, 171-211, 2009
2009 30 45.5%
Tracking and analyzing the evolution of provenance from scripts
Provenance and Annotation of Data and Processes: 6th International …, 2016
2016 25 45.3%
Visual Summaries for Graph Collections
IEEE Pacific Vis 2013, 2013
2013 28 44.7%
Automated development of data processing results
US Patent App. 13/124,201, 2011
2011 29 44.7%
Understanding how people consume low quality and extreme news using web traffic data
arXiv preprint arXiv:2201.04226, 2022
2022 7 44.6%
Alphad3m: an open-source automl library for multiple ml tasks
International Conference on Automated Machine Learning, 22/1-22, 2023
2023 4 44.5%
BugDoc: Iterative debugging and explanation of pipeline
The VLDB Journal 32 (1), 75-101, 2023
2023 4 44.5%
Towards provenance-enabling paraview
International Provenance and Annotation Workshop, 120-127, 2008
2008 29 44.4%
How can we make sound replication decisions?
Proceedings of the National Academy of Sciences (PNAS) 122 (5), e2401236121, 2025
2025 2 44.3%
Information preservation in XML-to-relational mappings
International XML Database Symposium, 66-81, 2004
2004 28 43.5%
Provenance and Annotation of Data and Processes
Springer, 2008
2008 27 43.0%
Enabling provenance management for pre-existing applications
US Patent 8,190,633, 2012
2012 26 42.3%
Exploring What not to Clean in Urban Data: A Study Using New York City Taxi Trips
IEEE Data Engineering Bulletin 39 (2), 63-77, 2016
2016 22 42.2%
Method and system for adaptive discovery of content on a network
US Patent 8,965,865, 2015
2015 23 41.8%
Siphon++: a hidden-webcrawler for keyword-based interfaces
Proceedings of the 17th ACM conference on Information and knowledge …, 2008
2008 25 41.5%
The next 5 years: what opportunities should the database community seize to maximize its impact?
Proceedings of the 2020 ACM SIGMOD International Conference on Management of …, 2020
2020 13 41.3%
Effective discovery of meaningful outlier relationships
ACM Transactions on Data Science 1 (2), 1-33, 2020
2020 13 41.3%
Riobusdata: Outlier detection in bus routes of rio de janeiro
arXiv preprint arXiv:1601.06128, 2016
2016 21 41.1%
Report from Dagstuhl seminar 16041: Reproducibility of data-oriented experiments in e-science
Dagstuhl Reports 6 (1), 108-159, 2016
2016 21 41.1%
The provenance of workflow upgrades
International Provenance and Annotation Workshop, 2-16, 2010
2010 24 40.8%
Guest editors' introduction: Provenance in web applications
IEEE Internet Computing 15 (1), 17-21, 2010
2010 24 40.8%
Towards integrating workflow and database provenance
Provenance and Annotation of Data and Processes: 4th International …, 2012
2012 24 40.7%
Bridging the XML–Relational Divide with LegoDB: A Demonstration
IEEE International Conference on Data Engineering (ICDE), 759-761, 2003
2003 22 40.3%
Managing XML data: An abridged overview
Computing in science & engineering 6 (4), 12-19, 2004
2004 23 40.3%
Fine-grained provenance collection over scripts through program slicing
Provenance and Annotation of Data and Processes: 6th International …, 2016
2016 20 40.0%
Interactive exploration for domain discovery on the web
Proc. of KDD IDEA, 2016
2016 20 40.0%
Creating and exploring web form repositories
Proceedings of the 2010 ACM SIGMOD International Conference on Management of …, 2010
2010 23 40.0%
Provenance and Annotation of Data and Processes: Second International Provenance and Annotation Workshop (IPAW) 2008
Springer, 2008
2008 22 39.3%
PruSM: a prudent schema matching approach for web forms
Proceedings of the 19th ACM international conference on Information and …, 2010
2010 22 39.2%
Efficient Acquisition of Web Data through Restricted Query Interfaces.
WWW Posters, 2001
2001 20 39.1%
Understanding website behavior based on user agent
Proceedings of the 39th International ACM SIGIR conference on Research and …, 2016
2016 19 38.9%
Active database trigger processing using a trigger gateway
US Patent 6,594,656, 2003
2003 20 38.8%
An Automatic Framework to Continuously Monitor Multi-Platform Information Spread.
MISINFO@WWW, 2021
2021 9 38.3%
Viscaretrails: Visualizing trails in the electronic health record with timed word trees, a pancreas cancer use case
Workshop on Visual Analytics in Healthcare (VAHC), 2011
2011 21 38.3%
Looking at both the present and the past to efficiently update replicas of web content
Proceedings of the 7th annual ACM international workshop on Web information …, 2005
2005 20 38.2%
The exception that improves the rule
Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 1-6, 2016
2016 18 37.7%
An urban data profiler
Proceedings of the 24th International Conference on World Wide Web, 1389-1394, 2015
2015 19 37.5%
Bootstrapping domain-specific content discovery on the web
The World Wide Web Conference, 1476-1486, 2019
2019 13 37.4%
Syntactica
1996 16 37.3%
A collaborative approach to computational reproducibility
arXiv preprint arXiv:1709.01154, 2017
2017 16 36.8%
Examining statistics of workflow evolution provenance: A first study
Scientific and Statistical Database Management: 20th International …, 2008
2008 19 36.7%
ReproZip: the reproducibility packer
Journal of Open Source Software 1 (8), 107, 2016
2016 17 36.5%
A first study on temporal dynamics of topics on the web
Proceedings of the 25th International Conference Companion on World Wide Web …, 2016
2016 17 36.5%
Packing experiments for sharing and publication
Proceedings of the 2013 ACM SIGMOD International Conference on Management of …, 2013
2013 18 35.6%
Interactive audience expansion on large scale online visitor data
Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data …, 2021
2021 8 35.1%
Designing a provenance-based climate data analysis application
Provenance and Annotation of Data and Processes: 4th International …, 2012
2012 18 35.1%
Using Pipeline Performance Prediction to Accelerate AutoML Systems
Proceedings of the Seventh Workshop on Data Management for End-to-End …, 2023
2023 3 35.0%
Managing provenance of the evolutionary development of workflows
US Patent App. 11/697,922, 2008
2008 17 35.0%
Provenance in web applications
IEEE Internet Computing 15 (1), 2011
2011 17 34.5%
Web services and information delivery for diverse environments
Proceedings of VLDB Workshop on Technologies for E-Services, 2000
2000 15 34.2%
Adaptive XML shredding: Architecture, implementation, and challenges
Workshop on Data Integration over the Web, 104-116, 2002
2002 15 34.1%
Enabling reproducible science with VisTrails
arXiv preprint arXiv:1309.1784, 2013
2013 16 33.4%
Provenance in scientific workflow systems
IEEE Data Engineering Bulletin, 2007
2007 15 32.8%
Exploring the coming repositories of reproducible experiments: Challenges and opportunities
Proceedings of the VLDB Endowment 4 (12), 1494-1497, 2011
2011 15 32.5%
Towards enabling social analysis of scientific data
CHI Social Data Analysis Workshop, 2008
2008 14 31.9%
DSDD: Domain-Specific Dataset Discovery on the Web
Proceedings of the 30th ACM International Conference on Information …, 2021
2021 7 31.6%
From papers to practice: the openclean open-source data cleaning library
Proceedings of the VLDB Endowment 14 (12), 2763-2766, 2021
2021 7 31.6%
DataExposer: exposing disconnect between data and systems
arXiv preprint arXiv:2105.06058, 2021
2021 7 31.6%
Visualizing the evolution of module workflows
2015 19th International Conference on Information Visualisation, 40-49, 2015
2015 14 31.6%
Using provenance to streamline data exploration through visualization
Technical Report UUSCI-2006-016, SCI Institute–Univ. of Utah, 2006
2006 13 31.3%
Making LDAP active with the LTAP gateway: Case study in providing telecom integration and enhanced services
International Workshop on Databases in Telecommunications, 54-73, 1999
1999 12 31.2%
Provenance and Annotation of Data and Processes, chapter The Open Provenance Model: An Overview
Springer 3, 323-326, 2008
2008 13 30.8%
Learning to discover domain-specific web content
Proceedings of the Eleventh ACM International Conference on Web Search and …, 2018
2018 11 30.6%
Provenance-enabled data exploration and visualization with vistrails
Proceedings of the 2010 23RD SIBGRAPI-Conference on Graphics, Patterns and …, 2010
2010 13 30.2%
Practical problems in coupling deductive engines with relational databases
Proceedings of the 5th Workshop on Knowledge Representation meets Databases …, 1998
1998 11 30.1%
Bridging Vocabularies to Link Tweets and News
Proceedings of International Workshop on the Web and Databases (WebDB), 2014
2014 13 29.9%
Using workflow medleys to streamline exploratory tasks
Scientific and Statistical Database Management: 21st International …, 2009
2009 12 29.9%
Information sharing in science 2.0: Challenges and opportunities
CHI Workshop on The Changing Face of Digital Science: New Practices in …, 2009
2009 12 29.9%
A Computational Reproducibility Benchmark.
IEEE Data Eng. Bull. 36 (4), 54-59, 2013
2013 13 29.8%
Automatically constructing a directory of molecular biology databases
Data Integration in the Life Sciences: 4th International Workshop, DILS 2007 …, 2007
2007 12 29.5%
An information theory approach to detect media bias in news websites
Proc. ACM KDD Workshop Issues Sentiment Discovery Opinion Mining (WISDOM), 1-9, 2020
2020 8 29.5%
Towards Evaluating Exploratory Model Building Process with AutoML Systems
arXiv preprint arXiv:2009.00449, 2020
View Details
2020 8 29.5%
Vistrails provenance traces for benchmarking
Proceedings of the joint EDBT/ICDT 2013 workshops, 323-324, 2013
View Details
2013 12 28.5%
Towards process provenance for existing applications
Proceedings of the 2nd International Provenance and Annotation Workshop, 120-127, 2008
View Details
2008 11 28.3%
Using wrappers for device independent web access: Opportunities, challenges and limitations
WWW Workshop on Mobile Search, 2002
View Details
2002 10 28.2%
Discovering and measuring malicious url redirection campaigns from fake news domains
2021 IEEE Security and Privacy Workshops (SPW), 1-6, 2021
View Details
2021 6 27.9%
Auctus: A dataset search engine for data augmentation
arXiv preprint arXiv:2102.05716, 2021
View Details
2021 6 27.9%
VisTrails
The Architecture of Open Source Applications, 2011
View Details
2011 11 27.8%
A generic and flexible framework for mapping XML documents into relations
VLDB’04: Proceedings of 30th International Conference on Very Large Data Bases, 2004
View Details
2004 10 27.7%
A model project for reproducible papers: critical temperature for the Ising model on a square lattice
arXiv preprint arXiv:1401.2000, 2014
View Details
2014 11 27.3%
A first study on strategies for generating workflow snippets
Proceedings of the First International Workshop on Keyword Search on …, 2009
View Details
2009 10 27.2%
Automatically extracting form labels
2008 IEEE 24th International Conference on Data Engineering, 1498-1500, 2008
View Details
2008 10 27.1%
Viscomplete: Data-driven suggestions for visualization systems
IEEE Transactions on Visualization and Computer Graphics 14 (6), 1691-1698, 2008
View Details
2008 10 27.1%
ReproServer: making reproducibility easier and less intensive
arXiv preprint arXiv:1808.01406, 2018
View Details
2018 9 27.0%
Real-time clustering for large sparse online visitor data
Proceedings of The Web Conference 2020, 1049-1059, 2020
View Details
2020 7 26.6%
Virtual lightweight snapshots for consistent analytics in NoSQL stores
2016 IEEE 32nd International Conference on Data Engineering (ICDE), 1310-1321, 2016
View Details
2016 10 26.5%
IMAX: Incremental maintenance of schema-based XML statistics
21st International Conference on Data Engineering (ICDE'05), 273-284, 2005
View Details
2005 9 26.5%
Querying Wikipedia documents and relationships
Proceedings of the 13th International Workshop on the Web and Databases, 1-6, 2010
View Details
2010 10 26.3%
Magneto: Combining Small and Large Language Models for Schema Matching
arXiv preprint arXiv:2412.08194, 2024
View Details
2024 2 25.7%
Enhancing Biomedical Schema Matching with LLM-based Training Data Generation
NeurIPS 2024 Third Table Representation Learning Workshop, 2024
View Details
2024 2 25.7%
Simple analysis of priority sampling
2024 Symposium on Simplicity in Algorithms (SOSA), 224-229, 2024
View Details
2024 2 25.7%
Efficiently Estimating Mutual Information Between Attributes Across Tables
2024 IEEE 40th International Conference on Data Engineering (ICDE), 193-206, 2024
View Details
2024 2 25.7%
Typex: A type based approach to XML stream querying
WebDB 2003 International Workshop on Web and Databases, 55-60, 2003
View Details
2003 8 25.3%
Should we all be teaching "intro to data science" instead of "intro to databases"?
Proceedings of the 2014 ACM SIGMOD international conference on Management of …, 2014
View Details
2014 9 24.2%
Towards understanding real-estate ownership in New York City: Opportunities and challenges
Proceedings of the International Workshop on Data Science for Macro-Modeling …, 2014
View Details
2014 9 24.2%
Software infrastructure for exploratory visualization and data analysis: past, present, and future
Journal of Physics: Conference Series 125 (1), 012100, 2008
View Details
2008 8 24.0%
Provenance and scientific workflows: challenges and opportunities
Proceedings of the 2008 ACM SIGMOD international conference on Management of …, 2008
View Details
2008 8 24.0%
Querying and exploring polygamous relationships in urban spatio-temporal data sets
Proceedings of the 2017 ACM International Conference on Management of Data …, 2017
View Details
2017 8 23.9%
Provenance and Reproducibility
Encyclopedia of Database Systems, 2017
View Details
2017 8 23.9%
A gpu-friendly geometric data model and algebra for spatial queries: Extended version
arXiv preprint arXiv:2004.03630, 2020
View Details
2020 6 23.5%
XML and data management
WWW-2002 Tutorial, 2002
View Details
2002 7 23.1%
MetaComm: A meta-directory for telecommunications
Proceedings of 16th International Conference on Data Engineering (Cat. No …, 2000
View Details
2000 7 22.9%
Reorganizing workflow evolution provenance
6th USENIX Workshop on the Theory and Practice of Provenance (TaPP 2014), 2014
View Details
2014 8 22.5%
Malevolent machine learning
Communications of the ACM 62 (12), 13-15, 2019
View Details
2019 6 22.0%
eTOP: Early Termination of Pipelines for Faster Training of AutoML Systems
arXiv preprint arXiv:2304.08597, 2023
View Details
2023 2 21.8%
XML management for bioinformatics applications
Computing in Science & Engineering 13 (5), 12-23, 2010
View Details
2010 7 21.4%
Combining scheduling strategies in tabled evaluation
Workshop on Parallelism and Implementation Technology for Logic Programming, 1997
View Details
1997 6 21.1%
Visualizing uncertainty with uncertainty multiples
GeoCongress 2006: Geotechnical Engineering in the Information Technology Age …, 2006
View Details
2006 6 20.7%
Winds from Seattle: Database research directions
Proceedings of the VLDB Endowment 13 (12), 3516-3516, 2020
View Details
2020 5 20.1%
VisTrails: Using provenance to streamline data exploration
Poster Proceedings of the International Workshop on Data Integration in the …, 2007
View Details
2007 6 19.8%
Biological Resource Discovery
Encyclopedia of Database Systems, 2017
View Details
2017 6 19.6%
On the connectivity of spaces of three-dimensional tilings
arXiv preprint arXiv:1702.00798, 2017
View Details
2017 6 19.6%
Indexing web form constraints
Journal of Information and Data Management 1 (3), 343, 2010
View Details
2010 6 19.4%
An ecosystem of applications for modeling political violence
Proceedings of the 2021 International Conference on Management of Data, 2384 …, 2021
View Details
2021 4 19.3%
PRIMAD-Information gained by different types of reproducibility
DAGSTUHL REPORTS 6 (1), 128-132, 2016
View Details
2016 6 18.8%
VeriWeb: A platform for automating web site testing
Proceedings of the World Wide Web Conference (WWW)–Web Engineering track, 2002
View Details
2002 5 18.4%
Scheduling Strategies for Evaluation of Recursive Queries over Memory and Disk-Resident Data
PhD thesis, Department of Computer Science, State University of New York, 1997
View Details
1997 5 18.2%
Visualization in radiation oncology: Towards replacing the laboratory notebook
Technical Report UUSCI-2006-017, SCI Institute–Univ. of Utah, 2006
View Details
2006 5 18.2%
Provenance in Workflows
Encyclopedia of Database Systems, 2017
View Details
2017 5 17.2%
Real-time understanding of humanitarian crises via targeted information retrieval
IBM Journal of Research and Development 61 (6), 7:1-7:12, 2017
View Details
2017 5 17.2%
Indexing relations on the web
Proceedings of the 13th International Conference on Extending Database …, 2010
View Details
2010 5 17.1%
Riding from Urban Data to Insight Using New York City Taxis
IEEE Data Engineering Bulletin 37 (4), 43-55, 2014
View Details
2014 5 16.4%
Analogy based workflow identification
US Patent 8,762,186, 2014
View Details
2014 5 16.4%
Integrated Analytics and Visualization for Multi-Modality Transportation Data
Connected Cities for Smart Mobility toward Accessible and Resilient …, 2019
View Details
2019 4 15.9%
Understanding spatio-temporal urban processes
2019 IEEE International Conference on Big Data (Big Data), 563-572, 2019
View Details
2019 4 15.9%
The XSB System Version 3.0 Volume 2: Libraries, Interfaces and Packages
Technical report, XSB, 2006
View Details
2006 4 15.6%
Governance of the open provenance model
URL http://twiki.ipaw.info/pub/OPM/WebHome/governance.pdf 10, 2009
View Details
2009 4 15.2%
Why should we teach machines to read charts made for humans?
View Details
2018 4 14.8%
Connecting visualization and data management research (Dagstuhl Seminar 17461)
Dagstuhl, 2018
View Details
2018 4 14.8%
Using latent-structure to detect objects on the web
Proceedings of the 13th International Workshop on the Web and Databases, 1-6, 2010
View Details
2010 4 14.5%
XML Storage
Encyclopedia of Database Systems, 2017
View Details
2017 4 14.5%
Gpu-powered spatial database engine for commodity hardware: Extended version
arXiv preprint arXiv:2203.14362, 2022
View Details
2022 2 12.1%
What we learned about The Gateway Pundit from its own web traffic data.
ICWSM Workshops, 2022
View Details
2022 2 12.1%
The right tool for the job: Data-centric workflows in vizier
Bulletin of the Technical Committee on Data Engineering 45 (3), 2022
View Details
2022 2 12.1%
Diversity and inclusion activities in database conferences: A 2021 report
ACM SIGMOD Record 51 (2), 69-73, 2022
View Details
2022 2 12.1%
XML processing: A comprehensive solution to the XML-to-relational mapping problem
Proceedings of the 6th Annual ACM International Workshop on Web Information …, 2005
View Details
2005 3 12.0%
The open provenance model (v1.01)
University of Southampton, 2008
View Details
2008 3 11.5%
Prudent schema matching for web forms
Tech. rep., University of Utah, Salt Lake City, UT, USA, 2008
View Details
2008 3 11.5%
Simplifying the design of workflows for large-scale data exploration and visualization
Proceedings of the Microsoft eScience Workshop, 2008
View Details
2008 3 11.5%
Querying structured information sources on the Web
Proceedings of the 10th International Conference on Information Integration …, 2008
View Details
2008 3 11.5%
Second International Provenance and Annotation Workshop, volume 5272 of LNCS
Springer, 2008
View Details
2008 3 11.5%
IPAW
View Details
2013 3 10.9%
noworkflow: Capturing and analyzing provenance of scripts
Provenance and Annotation of Data and Processes: 5th International …, 2015
View Details
2015 3 10.7%
Automatically constructing collections of online database directories
Proceedings of the 15th ACM international conference on Information and …, 2006
View Details
2006 2 6.9%
Desenvolvimento de estruturas de controle explícito para o SGWfC VisTrails
Proceedings of the Brazilian Symposium on Databases (SBBD), 2009
View Details
2009 2 6.8%
Query-driven visualization in the cloud with MapReduce
Proc. of the Fourth Annual Workshop on Ultrascale Visualization, 2009
View Details
2009 2 6.8%
Provenance management: Challenges and opportunities
Datenbanksysteme in Business, Technologie und Web (BTW)–13. Fachtagung des …, 2009
View Details
2009 2 6.8%
Scientific Data Management: Challenges, Technology, and Deployment
Chapman & Hall/CRC Computational Science, 2009
View Details
2009 2 6.8%
Defog: A system for data-backed visual composition
Technical Report UUSCI-2011-003, SCI Institute, University of Utah, 2011
View Details
2011 2 6.7%
Provenance-enabled data exploration and visualization with vistrails
2010 23RD SIBGRAPI-Conference on Graphics, Patterns and Images Tutorials, 1-9, 2010
View Details
2010 2 6.6%
Clustering Wikipedia infoboxes to discover their types
Proceedings of the 21st ACM international conference on Information and …, 2012
View Details
2012 2 6.3%
Whiteboard: a collaborative pen-based annotation tool for e-learning
II Workshop TIDIA, São Paulo, Brazil, 2005
View Details
2005 1 0.0%
Exploring what not to clean in urban data: A study using new york city taxi trips
Data Engineering, 63, 2016
View Details
2016 1 0.0%
Towards Locating and Exploring Hard-to-Find Information on the Web
New York University New York United States, 2018
View Details
2018 1 0.0%
NYUCIN at the NTCIR-16 Dataset Search 2 Task
Proceedings of the 16th NTCIR Conference on Evaluation of Information Access …, 2022
View Details
2022 1 0.0%
Matrix Product Sketching via Coordinated Sampling
arXiv preprint arXiv:2501.17836, 2025
View Details
2025 1 0.0%
The magazine archive includes every article published in Communications of the ACM for over the past 50 years.
Communications of the ACM 65 (8), 72-79, 2022
View Details
2022 1 0.0%
Correction to: BugDoc: Iterative debugging and explanation of pipeline executions
The VLDB Journal 32 (2), 473-473, 2023
View Details
2023 1 0.0%
Controlling the Search in Tabled Evaluations.
ILPS, 409, 1997
View Details
1997 1 0.0%
The Provenance of Workflow Upgrades
Provenance and Annotation of Data and Process: Third International …, 2011
View Details
2011 1 0.0%
Special Issue on Data Management beyond Database Systems
IEEE Data Engineering Bulletin, 2012
View Details
2012 1 0.0%
Introduction to the VisTrails System
Technical Report 2, University of Utah, 2012
View Details
2012 1 0.0%
Scheduling in SLG revisited
TAPD'98: tabulation in parsing and deduction (Paris, 2-3 April 1998), 62-66, 1998
View Details
1998 1 0.0%
A Flexible Infrastructure for Gathering XML Statistics and Estimating Query Cardinality.
ICDE, 857, 2004
View Details
2004 1 0.0%
AlphaD3M: An Open-Source AutoML Library for Multiple ML Tasks
AutoML Conference 2023 (ABCD Track), 2023
View Details
2023 1 0.0%
AutoDDG: Automated Dataset Description Generation using Large Language Models
arXiv preprint arXiv:2502.01050, 2025
View Details
2025 1 0.0%
Bell Labs Research, 600 Mountain Ave., Murray Hill, NJ 07974 (Received 31 May 2000; in final revised form||)
Information Systems 19 (4), 1-24, 1994
View Details
1994 1 0.0%
Proceedings of the 19th International Conference on World Wide Web (WWW 2010)
Unknown publisher, 2010
View Details
2010 1 0.0%
Computational repeatability: The wikiquery case study
View Details
2011 1 0.0%
Visualization in Radiation Oncology: Towards Replacing the Laboratory Notebook (SCI Institute Technical Report, No. UUSCI-2006-17)
University of Utah, 2006
View Details
2006 1 0.0%
Maximum common subelement metrics and its applications to graphs
arXiv preprint arXiv:1501.06774, 2015
View Details
2015 1 0.0%
Diversity, equity and inclusion activities in database conferences: A 2022 report
ACM SIGMOD Record 52 (2), 38-42, 2023
View Details
2023 1 0.0%
Interactive Data Harmonization with LLM Agents
arXiv preprint arXiv:2502.07132, 2025
View Details
2025 1 0.0%
Towards Supporting Collaborative Data Analysis and Visualization in a Coastal Margin Observatory
CSCW 2010 Workshop on The Changing Dynamics of Scientific Collaboration, 2010
View Details
2010 1 0.0%
Parallelizing Tabled Evaluations Extended Abstract
Workshop on Design and Impl. of Parallel Logic Programming Systems, 18-31, 1994
View Details
1994 1 0.0%
Siphoning Hidden-Web Data through Keyword-Based Interfaces: Retrospective
Journal of Information and Data Management 1 (1), 145-145, 2010
View Details
2010 1 0.0%
The Singularity in Data and Computation-Driven Science: Can It Scale Beyond Machine Learning?
Harvard Data Science Review 6 (1), 2024
View Details
2024 1 0.0%
Prevalence of endangered shark trophies in automated detection of the online wildlife trade
Biological Conservation 304, 2025
View Details
2025 1 0.0%