DATA ANALYSIS
ACIDRain: Concurrency-Related Attacks on Database-Backed Web Applications
MOTIVATION:
• 12 popular self-hosted e-commerce applications (deployed on over 2M websites, which represent over 50% of all e-commerce websites)
• 22 critical ACIDRain attacks identified and verified
• Flexcoin went bankrupt after such an attack.
PROBLEM DEFINITION:
An application is vulnerable if both of the following hold:
– Anomalies are possible. Under concurrent API access, the application may exhibit behaviors (i.e., anomalies) that could not have arisen under a serial execution.
– Sensitive invariants. The anomalies arising from concurrent access lead to violations of application invariants.
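The two conditions above can be illustrated with a small, deterministic simulation. This is only a sketch: the voucher scenario, the `run` function, and the schedules are hypothetical and are not taken from the paper. Each request reads the balance and then writes a deduction. Running the steps serially rejects the second request, while interleaving them accepts both and violates the invariant that no more than the voucher value is spent.

```python
# Hypothetical scenario: one voucher worth 100, two checkout requests of 100 each.
# Every request performs a read (check the balance) and then a write (deduct).

def run(schedule):
    """Execute request steps in the given order.

    schedule is a list of (request_id, step) pairs, step being "read" or "write".
    Returns (final balance, total amount the application accepted).
    """
    balance = 100   # shared database state
    seen = {}       # value each request read earlier (possibly stale)
    spent = 0       # total amount accepted across all requests
    for request_id, step in schedule:
        if step == "read":
            seen[request_id] = balance
        elif step == "write" and seen[request_id] >= 100:
            balance = seen[request_id] - 100   # write based on the earlier read
            spent += 100
    return balance, spent

serial = [(1, "read"), (1, "write"), (2, "read"), (2, "write")]
interleaved = [(1, "read"), (2, "read"), (1, "write"), (2, "write")]

serial_result = run(serial)            # second request sees balance 0 and is rejected
interleaved_result = run(interleaved)  # both requests read 100 and are accepted
```

In the interleaved schedule the application accepts 200 against a voucher worth 100, a result no serial execution could produce.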
SOLUTION:
• Execute API calls against a live application and database to generate a (possibly sequential) trace of database activity.
• Analyze the trace for potential anomalies that could arise under concurrent execution.
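As a rough illustration of the second step, the sketch below scans a recorded trace for one simple pattern: an API call reads an item and later writes it without both statements sharing a transaction. The `Op` record layout, the `potential_anomalies` function, and the sample trace are assumptions made for this example. It is a deliberately simplified check, not the analysis described in the paper.

```python
from collections import namedtuple

# Hypothetical trace record: the API call that issued the statement, the
# transaction it ran in (None = no explicit transaction), the access kind,
# and the item touched.
Op = namedtuple("Op", "api_call txn kind item")

def potential_anomalies(trace):
    """Return (api_call, item) pairs where a read is later followed by a write
    to the same item outside the transaction that performed the read."""
    flagged = []
    reads = {}  # (api_call, item) -> transaction id of the earliest read
    for op in trace:
        key = (op.api_call, op.item)
        if op.kind == "read":
            reads.setdefault(key, op.txn)
        elif op.kind == "write" and key in reads:
            read_txn = reads[key]
            same_txn = read_txn is not None and read_txn == op.txn
            if not same_txn and key not in flagged:
                flagged.append(key)
    return flagged

trace = [
    Op("checkout", None, "read", "voucher.balance"),
    Op("checkout", None, "write", "voucher.balance"),
    Op("add_to_cart", 7, "read", "cart.total"),
    Op("add_to_cart", 7, "write", "cart.total"),
]
flagged = potential_anomalies(trace)
```

Sharing a transaction does not by itself rule out anomalies, since weak isolation levels can still allow them, so a fuller analysis would also need to consider the isolation level in use.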
PREVENTION:
– SELECT FOR UPDATE
– User-level concurrency control
• Prevent concurrent calls in the same session
– Single read of data
– Multiple validations
• The authors opened issues on GitHub
• 18 different vulnerabilities reported
• 7 of them confirmed
• One report received the feedback: “use your brain! Its [sic] not hard to come up with a solution that does not involve coding.”
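The user-level concurrency control listed under PREVENTION (preventing concurrent calls in the same session) might look roughly like the following. This is a minimal sketch assuming a single-process server. The names `handle_request`, `_session_locks`, and `_registry_guard` are hypothetical and do not come from the paper or from any web framework.

```python
import threading
from collections import defaultdict

# One lock per session id (hypothetical in-process registry).
_session_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()   # protects the registry dictionary itself

def handle_request(session_id, action):
    """Run action() only if no other request of this session is in flight."""
    with _registry_guard:
        lock = _session_locks[session_id]
    if not lock.acquire(blocking=False):
        return "rejected: concurrent request in this session"
    try:
        return action()
    finally:
        lock.release()

# Simulate a second request arriving while the first is still being handled.
def first_request():
    return ("done", handle_request("s1", lambda: "second done"))

outcome = handle_request("s1", first_request)
other_session = handle_request("s2", lambda: "done")
```

An in-process lock only helps when all requests of a session reach the same process. A deployment with several workers would need a shared mechanism instead, such as a row lock taken with SELECT FOR UPDATE.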
Democratizing Data Science through Interactive Curation of ML Pipelines
INTRODUCTION TO ALPINE MEADOW:
– Alpine Meadow is a new automated machine learning (AutoML) tool.
– The tool is interactive, which is claimed to be important for fast operation.
– For a given ML task, the tool selects the best available settings and constructs the final pipeline automatically.
– It is claimed to outperform many comparable tools in a number of cases.
AutoML APPROACH:
AutoML, or “Learning to Learn”, is an approach in ML systems. The aim is to give the best prediction result as quickly as possible by automatically selecting the best available:
– data processing steps
– learning algorithm
– hyperparameters
THE NEED FOR IMPROVEMENTS IN AutoML:
Current AutoML systems:
– Take days or weeks to complete.
– Usually do not allow user intervention.
Providing a quick response is important. If computing power is at its limit, user interaction might save time.
CONTRIBUTIONS OF ALPINE MEADOW:
The tool is interactive and demonstrates the benefits of user interaction. Its pipeline selection provides new insights for AutoML approaches.
THE SYSTEM OVERVIEW:
– The approach starts by mimicking a data scientist.
– Optimize and automate every step as much as possible.
– Build many pipelines, select the most plausible one, and show the results of the best one.
– The selection process is analogous to query optimization.
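The "build many pipelines, select the best one" idea can be sketched with a toy search. Everything here is a hypothetical illustration rather than Alpine Meadow's actual method: the data, the two preprocessors, the threshold classifier, and the exhaustive enumeration are all assumptions. The generator reports each improvement as soon as it is found, loosely mirroring the interactive aspect.

```python
from itertools import product

# Toy data (hypothetical): the label is 1 when |x| > 2.
X = [-4, -3, -1, 0, 1, 3, 4, 5]
y = [1, 1, 0, 0, 0, 1, 1, 1]

preprocessors = {"identity": lambda v: v, "abs": abs}   # data processing step
thresholds = [0, 2, 4]   # hyperparameter of a simple threshold classifier

def accuracy(prep, threshold):
    """Fraction of points classified correctly by 'prep(x) > threshold'."""
    predictions = [1 if prep(v) > threshold else 0 for v in X]
    return sum(p == t for p, t in zip(predictions, y)) / len(y)

def search():
    """Yield (score, preprocessor name, threshold) whenever a better pipeline is found."""
    best = -1.0
    for (name, prep), threshold in product(preprocessors.items(), thresholds):
        score = accuracy(prep, threshold)
        if score > best:
            best = score
            yield score, name, threshold

progress = list(search())      # intermediate results a user could inspect
best_pipeline = progress[-1]   # the last improvement is the best pipeline found
```

A real system would not enumerate every combination. The analogy with query optimization suggests choosing which candidate pipelines to evaluate based on their expected cost and quality.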
JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes
INTRODUCTION:
A solution for finding joinable tables is proposed.
- Given a table and a join column, the system finds all joinable tables using a cheaper method: overlap set similarity search.
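To make overlap set similarity search concrete, the sketch below ranks candidate columns by how many distinct values they share with the query column, using a plain inverted index. The column names, the sample values, and the functions `build_index` and `top_k_overlap` are hypothetical. This is the basic index scan only and does not implement JOSIE's cost function.

```python
from collections import Counter, defaultdict
import heapq

def build_index(columns):
    """Map each distinct value to the names of the columns containing it."""
    index = defaultdict(set)
    for name, values in columns.items():
        for value in set(values):
            index[value].add(name)
    return index

def top_k_overlap(query_values, index, k):
    """Return the k columns sharing the most distinct values with the query."""
    overlap = Counter()
    for value in set(query_values):
        for name in index.get(value, ()):
            overlap[name] += 1
    # Largest overlap first; ties are broken by column name for a stable order.
    return heapq.nsmallest(k, overlap.items(), key=lambda item: (-item[1], item[0]))

columns = {
    "orders.city": ["Oslo", "Rome", "Lima", "Rome"],
    "stores.city": ["Oslo", "Rome", "Paris"],
    "staff.name": ["Ana", "Bo"],
}
index = build_index(columns)
result = top_k_overlap(["Oslo", "Rome", "Lima", "Cairo"], index, k=2)
```

Columns with a large overlap are good join candidates because many query values would find a matching row.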
CONCLUSION:
- JOSIE, with its cost function, adapts to new data distributions
- Suitable for large data lakes
- Outperforms an approximate algorithm
- Future work:
  - Estimation of set intersection size based on token frequencies
  - Automated selection of query columns using past statistics
  - Fuzzy join