Automatically Generating Predicates and Solutions for Configuration Troubleshooting
Ya-Yunn Su, NEC Laboratories America
Jason Flinn, University of Michigan
Troubleshooting misconfigurations is hard!
• Users may have to
– Edit configuration files
– Resolve library dependencies
– Change environment variables
• Automated troubleshooting tools can help
– Chronus: finds when a misconfiguration entered the system
– AutoBash: automatically resolves misconfigurations
Current method: manual predicate creation
• Predicates
– Test if an application works or not
– Returns true/false if the test passes/fails
• E.g., test if an Apache Web server is working
– wget http://localhost
• Manually writing predicates requires
– Experts and time
– Domain knowledge
• Can we automatically generate predicates?
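A hand-written predicate of the kind described above can be sketched in a few lines. This is only an illustration, not the tooling used in the talk: the helper name `run_predicate` and the exact `wget` flags in `APACHE_PREDICATE` are assumptions, and "passes" is taken to mean "exits with status 0".

```python
import subprocess
import sys


def run_predicate(argv):
    """Run a test command; return True if it exits with status 0, else False."""
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The test program itself is missing, so the test cannot pass.
        return False
    return completed.returncode == 0


# Hypothetical predicate for the Apache example (flags are an assumption).
APACHE_PREDICATE = ["wget", "-q", "-O", "/dev/null", "http://localhost"]

if __name__ == "__main__":
    # Stand-in commands so the demo runs without a web server.
    print(run_predicate([sys.executable, "-c", "raise SystemExit(0)"]))
    print(run_predicate([sys.executable, "-c", "raise SystemExit(1)"]))
```

Even this small test needs someone to know which command to run and how to read its result, which is the manual effort the talk aims to remove.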
Limitations in existing approaches
• Automatic test case generation requires
– Program source code or specifications
• Automatic solution generation requires
– Golden state as a reference
• Users already troubleshoot misconfigurations
– They try potential solutions
– They test if a solution works
Generating predicates from user traces
• Users troubleshoot using our modified shell
• Our modified shell generates:
– Which command is a predicate
– If a predicate succeeds/fails
– Which commands are solutions
[Figure: a trace of commands A to D entered in the modified shell yields a predicate (commands A & B), its test result (true or false), and a solution (command D).]
Goals
• Minimize false positives
– A false positive is worse than a false negative
– Aggregate across multiple user traces
• Be as unobtrusive as possible
– Users do not need to provide extra input
• Generate complete predicates
Minimizing false positives
• Observation: troubleshooting pattern
– Users test the system state multiple times
– Users rely on output to know test outcome
• Generate predicates following this pattern
[Figure: timeline. Command C is run twice: the first run, while the system was not working, gives C0 = False; the second run, once the system was working, gives C1 = True.]
Our approach
• Predicates
– Repeated commands
– Differ in more than two out of three output features
• Output features for a command:
– exit code: the return value of a process
– screen output: whether it contains an error message
– output set: kernel objects a command modifies
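The rule above (a repeated command whose executions differ in enough output features) can be sketched as follows. The `Execution` record, the function names, and the choice to compare each execution with the previous run of the same command are assumptions made for illustration. The threshold `min_differing` is also an assumption: it defaults to 3, the literal reading of "more than two out of three", and can be lowered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Execution:
    """One command execution with its three output features (hypothetical record)."""
    command: str
    exit_code: int
    has_error_message: bool
    output_set: frozenset


def differing_features(first, second):
    """Count how many of the three output features differ between two runs."""
    return sum([
        first.exit_code != second.exit_code,
        first.has_error_message != second.has_error_message,
        first.output_set != second.output_set,
    ])


def find_predicates(trace, min_differing=3):
    """Return (command, earlier_index, later_index) for repeated commands
    whose consecutive executions differ in at least min_differing features."""
    last_seen = {}  # command text -> index of its most recent execution
    found = []
    for index, run in enumerate(trace):
        previous = last_seen.get(run.command)
        if previous is not None:
            if differing_features(trace[previous], run) >= min_differing:
                found.append((run.command, previous, index))
        last_seen[run.command] = index
    return found
```

Requiring several features to differ at once is what keeps the rule conservative: a command that is merely typed twice, with the same outcome both times, is not reported.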
Tracking output sets
• Output set: kernel objects a command causally affects
Command: echo hi > foo
[Figure: the forked process echo creates foo, touching its file metadata, file content, and directory entry; when the process exits, its output set is complete.]
Output set = {file foo}
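The idea of an output set can be illustrated with a toy event log. This is a simplified user-level model, not the kernel-level tracking described in the talk: the event tuple format `(pid, action, target)` and the action names are assumptions, and only objects created or written by the command's process tree are counted.

```python
def output_set(root_pid, events):
    """Collect the objects modified by a command's process or its descendants.

    events is a time-ordered list of (pid, action, target) tuples, where
    action is "fork" (target is the child pid), "create" or "write"
    (target is an object name), or anything else (ignored).
    """
    processes = {root_pid}  # processes that belong to this command
    modified = set()
    for pid, action, target in events:
        if pid not in processes:
            continue  # event caused by an unrelated process
        if action == "fork":
            processes.add(target)
        elif action in ("create", "write"):
            modified.add(target)
    return modified


if __name__ == "__main__":
    # Toy log for "echo hi > foo": process 100 forks 101, which creates foo.
    log = [
        (100, "fork", 101),
        (101, "create", "file:foo"),
        (101, "write", "file:foo"),
        (101, "exit", None),
    ]
    print(output_set(100, log))
```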
Example
Problem: CVS repository not initialized
% cvs -d /home/cvsroot import test_project
cvs [import aborted]: /home/cvsroot/CVSROOT: No such file or directory
% cvs -d /home/cvsroot init
% cvs -d /home/cvsroot import test_project
N test_project/testfile
No conflicts created by this import
• Find repeated commands
• Compare output features of repeated commands
• Output feature: exit codes differ
– First execution: exit code = 1
– Second execution: exit code = 0
• Output feature: screen outputs differ
– First execution: error message (No such file or directory)
– Second execution: no error message
• Output feature: output sets differ
– First execution: output set is empty => Output set = {}
– Second execution: output set contains created files => Output set = {file:/home/cvsroot/test_project, …}
• Repeated commands differ in three output features
– First execution => predicate fails
– Second execution => predicate succeeds
Generating complete predicates
• Some predicates work correctly only after their preconditions have been executed
Problem: user2 is not in CVS group
Initial state: CVS repository is empty
user1 % cvs -d /home/cvsroot import test_project      (Precondition)
user2 % cvs -d /home/cvsroot checkout test_project    => Predicate fails
root % usermod -G cvsgroup user2                      (Solution)
user2 % cvs -d /home/cvsroot checkout test_project    => Predicate succeeds
How?
Causal relationships between commands
% echo hi > foo
% cat foo
• “cat foo” causally depends on “echo hi > foo”
[Figure: after echo exits, its output set contains foo's file metadata, file content, and directory entry; cat then reads foo.]
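The dependency between "cat foo" and "echo hi > foo" can be reproduced with a small propagation over read and write sets. This is a sketch under stated assumptions: each command is reduced to a hypothetical `(reads, writes)` pair of object names, dependencies are tracked by position in the trace, and an object keeps every command that ever affected it (a conservative choice, not necessarily the one made by the real system).

```python
def causal_dependencies(commands):
    """Return, for each command, the set of earlier command indices it depends on.

    commands is a time-ordered list of (reads, writes) pairs, each a set of
    object names. A command depends on every command that causally affected
    an object it reads, directly or through intermediate commands.
    """
    affected_by = {}  # object name -> indices of commands that affected it
    depends_on = []
    for index, (reads, writes) in enumerate(commands):
        deps = set()
        for obj in reads:
            deps |= affected_by.get(obj, set())
        depends_on.append(deps)
        # Objects written here inherit this command and everything it depends on.
        for obj in writes:
            affected_by.setdefault(obj, set()).update(deps | {index})
    return depends_on


if __name__ == "__main__":
    trace = [
        (set(), {"file:foo"}),   # echo hi > foo
        ({"file:foo"}, set()),   # cat foo
    ]
    print(causal_dependencies(trace))
```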
Applying causality to find preconditions
• Candidate preconditions: find
– All commands a predicate depends on
– All commands whose output set a predicate is in
[Figure: timeline. "cvs import as user1" modifies File: test_project; "cvs co as user2" fails; "Add user2 to CVS group" modifies File: /etc/group; "cvs co as user2" succeeds.]
• We also find the solution!
Heuristic to differentiate them
• Solutions: occurred after all failed predicates
• Preconditions: occurred before any failed predicate
[Figure: timeline. cvs import as user1 (Precondition); cvs co as user2 fails; Add user2 to CVS group (Solution); cvs co as user2 succeeds.]
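The timing heuristic above is simple enough to state as code. In this sketch, commands are identified by their position in the trace; the function name and argument shapes are assumptions. Candidates that fall between failed predicate executions match neither rule and are left unclassified.

```python
def classify_candidates(candidates, failed_predicate_times):
    """Split candidate commands into (preconditions, solutions) by timing.

    candidates: trace positions of commands the predicate causally depends on.
    failed_predicate_times: trace positions where the predicate failed.
    """
    if not failed_predicate_times:
        raise ValueError("at least one failed predicate execution is required")
    first_failure = min(failed_predicate_times)
    last_failure = max(failed_predicate_times)
    # Preconditions occurred before any failed predicate.
    preconditions = sorted(t for t in candidates if t < first_failure)
    # Solutions occurred after all failed predicates.
    solutions = sorted(t for t in candidates if t > last_failure)
    return preconditions, solutions


if __name__ == "__main__":
    # 0: cvs import as user1, 1: cvs co fails, 2: usermod, 3: cvs co succeeds
    print(classify_candidates({0, 2}, [1]))
```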
Ranking solutions
• Users solve the same problem differently
• Goal: better solutions are ranked higher
– Heuristic: solutions applied by more users are better
– Aggregate solutions among traces and rank them
• E.g., Apache not having search permission
– chmod 777 /home/USERID
– chmod 755 USERID/
– chmod 755 /home/USERID
Group solutions by state delta
• Different commands can be used to do the same thing
• State delta: the difference in system state caused by the execution of a command
– Track output set for that command
– Compute diff for each entity in the output set
• Solution ranking results:
– Group 1 (size = 2)
  1. chmod 755 /home/USERID
  2. chmod 755 USERID/
– Group 2 (size = 1)
  1. chmod 777 /home/USERID
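Grouping by state delta and ranking by group size can be sketched as below. The representation of a state delta as a dictionary from a hypothetical entity label to its new value is an assumption; in the talk the delta is computed by diffing each entity in the command's output set.

```python
from collections import defaultdict


def rank_solutions(solutions):
    """Group solutions with identical state deltas; largest group first.

    solutions is a list of (command, state_delta) pairs, one per trace,
    where state_delta maps an entity label to its new (hashable) value.
    """
    groups = defaultdict(list)
    for command, state_delta in solutions:
        # Two commands belong together if they change the same entities
        # to the same values, however the commands were typed.
        key = frozenset(state_delta.items())
        groups[key].append(command)
    # sorted() is stable, so equally sized groups keep first-seen order.
    return sorted(groups.values(), key=len, reverse=True)


if __name__ == "__main__":
    observed = [
        ("chmod 777 /home/USERID", {"/home/USERID mode": "777"}),
        ("chmod 755 USERID/", {"/home/USERID mode": "755"}),
        ("chmod 755 /home/USERID", {"/home/USERID mode": "755"}),
    ]
    for rank, group in enumerate(rank_solutions(observed), start=1):
        print(rank, len(group), group)
```

Keying on the effect rather than the command text is what lets "chmod 755 USERID/" and "chmod 755 /home/USERID" count as the same solution.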
Evaluation
• Questions:
– How well can we generate predicates?
– How well does the solution ranking heuristic work?
• Methodology
– Conduct a user study of users troubleshooting misconfigurations
– Generate predicates/solutions from traces
– Manually verify predicate correctness
User study procedure
• 12 participants:
– graduate students
– system administrators
• Each given four configuration problems
– Two CVS and two Apache configuration problems
– Each problem runs in a virtual machine
• Collected traces of users troubleshooting
Predicate result summary
                          CVS problem 1   CVS problem 2   Apache problem 1   Apache problem 2
# of correct predicates   4               4               6                  8
# of wrong predicates     0               0               1                  1
Total # of traces         10              10              11                 11
• All correct predicates are complete
• Very few wrong predicates (false positives)
• Both false positives come from traces in which the user did not solve the problem
• Why were no predicates generated for some traces?
Apache problem: predicate results
• Problem: Apache process not having search permission on /home/USERID
• Solution: give /home/USERID search permission
Predicates generated                                          Number of traces
No predicate generated (user did not use repeated commands)   3
No predicate generated (user did not fix the problem)         2
Incorrect predicate (user did not fix the problem)            1
• User did not fix the problem => output features did not differ
Apache problem: solution ranking results
Solution                          Number of traces
chmod 755 /home/USERID            2
chmod -R 777 USERID/              1
chmod o+rx /home/USERID           1
chmod 777 /home/USERID            1
vim /etc/httpd/conf/httpd.conf    1
Why is editing the configuration file a solution?
• Predicate: apachectl stop
• Errors the user introduced into the configuration file caused apachectl stop to fail
Future work
• Extend this work to handle GUI applications
• Challenges:
– identifying individual tasks, finding repeated tasks
– exit code does not map to each task
• Advantages: more semantic information
Conclusion
• Automatically generate predicates and solutions from user troubleshooting traces
• Our approach
– Minimizes false positives
– Is unobtrusive to users
– Generates complete predicates