Files
clang-p2996/compiler-rt/test/fuzzer/dataflow.test
Max Moroz b6e6d3c740 [libFuzzer] Fix DataFlow.cpp logic when tracing long inputs.
Summary:
1. Do not create DFSan labels for the bytes which we do not trace. This is where we run out of labels at the first place.
2. When dumping the traces on the disk, make sure to offset the label identifiers by the number of the first byte in the trace range.
3. For the last label, make sure to write it at the last position of the trace bit string, as that label represents the input size, not any particular byte.

Also fixed the bug with division in python which I've introduced when migrated the scripts to Python3 (`//` is required for integral division).

Otherwise, the scripts are wasting too much time unsuccessfully trying to
collect and process traces from the long inputs. For more context, see
https://github.com/google/oss-fuzz/issues/1632#issuecomment-481761789

Reviewers: kcc

Reviewed By: kcc

Subscribers: delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D60538

llvm-svn: 358311
2019-04-12 21:00:12 +00:00

96 lines
4.7 KiB
Plaintext

# Tests the data flow tracer.
REQUIRES: linux
UNSUPPORTED: aarch64
# Build the tracer and the test.
RUN: %no_fuzzer_cpp_compiler -c -fno-sanitize=all -fsanitize=dataflow %S/../../lib/fuzzer/dataflow/DataFlow.cpp -o %t-DataFlow.o
RUN: %no_fuzzer_cpp_compiler -fno-sanitize=all -fsanitize=dataflow -fsanitize-coverage=trace-pc-guard,pc-table,func,trace-cmp %S/ThreeFunctionsTest.cpp %t-DataFlow.o -o %t-ThreeFunctionsTestDF
RUN: %no_fuzzer_cpp_compiler -fno-sanitize=all -fsanitize=dataflow -fsanitize-coverage=trace-pc-guard,pc-table,func,trace-cmp %S/ExplodeDFSanLabelsTest.cpp %t-DataFlow.o -o %t-ExplodeDFSanLabelsTestDF
RUN: %cpp_compiler %S/ThreeFunctionsTest.cpp -o %t-ThreeFunctionsTest
# Dump the function list.
RUN: %t-ThreeFunctionsTestDF 2>&1 | FileCheck %s --check-prefix=FUNC_LIST
FUNC_LIST-DAG: LLVMFuzzerTestOneInput
FUNC_LIST-DAG: Func1
FUNC_LIST-DAG: Func2
# Prepare the inputs.
RUN: rm -rf %t/IN
RUN: mkdir -p %t/IN
RUN: echo -n ABC > %t/IN/ABC
RUN: echo -n FUABC > %t/IN/FUABC
RUN: echo -n FUZZR > %t/IN/FUZZR
RUN: echo -n FUZZM > %t/IN/FUZZM
RUN: echo -n FUZZMU > %t/IN/FUZZMU
RUN: echo -n 1234567890123456 > %t/IN/1234567890123456
# ABC: No data is used, the only used label is 4 (corresponds to the size)
RUN:%t-ThreeFunctionsTestDF 0 3 %t/IN/ABC | FileCheck %s --check-prefix=IN_ABC
IN_ABC: F{{[012]}} 0001
IN_ABC-NOT: F
# FUABC: First 3 bytes are checked, Func1/Func2 are not called.
RUN:%t-ThreeFunctionsTestDF 0 5 %t/IN/FUABC | FileCheck %s --check-prefix=IN_FUABC
IN_FUABC: F{{[012]}} 111001
IN_FUABC-NOT: F
# FUZZR: 5 bytes are used (4 in one function, 5-th in the other), Func2 is not called.
RUN:%t-ThreeFunctionsTestDF 0 5 %t/IN/FUZZR | FileCheck %s --check-prefix=IN_FUZZR
IN_FUZZR-DAG: F{{[012]}} 111101
IN_FUZZR-DAG: F{{[012]}} 000010
IN_FUZZR-NOT: F
# FUZZM: 5 bytes are used, both Func1 and Func2 are called, Func2 depends only on size (label 6).
RUN:%t-ThreeFunctionsTestDF 0 5 %t/IN/FUZZM | FileCheck %s --check-prefix=IN_FUZZM
IN_FUZZM-DAG: F{{[012]}} 000010
IN_FUZZM-DAG: F{{[012]}} 111101
IN_FUZZM-DAG: F{{[012]}} 000001
# FUZZMU: 6 bytes are used, both Func1 and Func2 are called, Func2 depends on byte 6 and size (label 7)
RUN:%t-ThreeFunctionsTestDF 0 6 %t/IN/FUZZMU | FileCheck %s --check-prefix=IN_FUZZMU
# Test merge_data_flow
RUN:rm -f %t-merge-*
RUN:%t-ThreeFunctionsTestDF 0 2 %t/IN/FUZZMU > %t-merge-1
RUN:%t-ThreeFunctionsTestDF 2 4 %t/IN/FUZZMU > %t-merge-2
RUN:%t-ThreeFunctionsTestDF 4 6 %t/IN/FUZZMU > %t-merge-3
RUN:%libfuzzer_src/scripts/merge_data_flow.py %t-merge-* | FileCheck %s --check-prefix=IN_FUZZMU
# Test collect_data_flow
RUN: %libfuzzer_src/scripts/collect_data_flow.py %t-ThreeFunctionsTestDF %t/IN/FUZZMU | FileCheck %s --check-prefix=IN_FUZZMU
IN_FUZZMU-DAG: F{{[012]}} 0000100
IN_FUZZMU-DAG: F{{[012]}} 1111001
IN_FUZZMU-DAG: F{{[012]}} 0000011
# A very simple test will cause DFSan to die with "out of labels"
RUN: not %t-ExplodeDFSanLabelsTestDF 0 16 %t/IN/1234567890123456 2>&1 | FileCheck %s --check-prefix=OUT_OF_LABELS
OUT_OF_LABELS: ==FATAL: DataFlowSanitizer: out of labels
# However we can run the same test piece by piece.
RUN: %t-ExplodeDFSanLabelsTestDF 0 2 %t/IN/1234567890123456
RUN: %t-ExplodeDFSanLabelsTestDF 2 4 %t/IN/1234567890123456
RUN: %t-ExplodeDFSanLabelsTestDF 4 6 %t/IN/1234567890123456
# Or we can use collect_data_flow
RUN: %libfuzzer_src/scripts/collect_data_flow.py %t-ExplodeDFSanLabelsTestDF %t/IN/1234567890123456
# Test that we can run collect_data_flow on the entire corpus dir
RUN: rm -rf %t/OUT
RUN: %libfuzzer_src/scripts/collect_data_flow.py %t-ThreeFunctionsTestDF %t/IN %t/OUT
RUN: %t-ThreeFunctionsTest -data_flow_trace=%t/OUT -runs=0 -focus_function=Func2 2>&1 | FileCheck %s --check-prefix=USE_DATA_FLOW_TRACE
USE_DATA_FLOW_TRACE: INFO: Focus function is set to 'Func2'
USE_DATA_FLOW_TRACE: INFO: DataFlowTrace: reading from {{.*}}/OUT
USE_DATA_FLOW_TRACE-DAG: a8eefe2fd5d6b32028f355fafa3e739a6bf5edc => |000001|
USE_DATA_FLOW_TRACE-DGA: d28cb407e8e1a702c72d25473f0553d3ec172262 => |0000011|
USE_DATA_FLOW_TRACE: INFO: DataFlowTrace: 6 trace files, 3 functions, 2 traces with focus function
# Test that we can run collect_data_flow on a long input (>2**16 bytes)
RUN: rm -rf %t/OUT
RUN: printf "%0.sA" {1..150001} > %t/IN/very_long_input
RUN: %libfuzzer_src/scripts/collect_data_flow.py %t-ThreeFunctionsTestDF %t/IN/very_long_input %t/OUT | FileCheck %s --check-prefix=COLLECT_TRACE_FOR_LONG_INPUT
RUN: rm %t/IN/very_long_input
COLLECT_TRACE_FOR_LONG_INPUT: ******* Trying:{{[ ]+}}[0, 150001]
COLLECT_TRACE_FOR_LONG_INPUT: ******* Trying:{{[ ]+}}[75000, 150001]
COLLECT_TRACE_FOR_LONG_INPUT: ******* Trying:{{[ ]+}}[112500, 150001]
COLLECT_TRACE_FOR_LONG_INPUT: ******* Success:{{[ ]+}}[{{[0123456789]+}}, 150001]
COLLECT_TRACE_FOR_LONG_INPUT: ******* Success:{{[ ]+}}[0, {{[0123456789]+}}]