HPC Codes-2
Assignment No. 1:
Design and implement Parallel Breadth-First Search and Depth First Search based on
existing algorithms using OpenMP. Use a Tree or an undirected graph for BFS and DFS.
Code:
#include <iostream>
#include <vector>
#include <queue>
#include <omp.h>
using namespace std;

class Graph {
    int V;                    // number of vertices
    vector<vector<int>> adj;  // adjacency lists
public:
    Graph(int V) : V(V), adj(V) {}

    // Undirected edge
    void addEdge(int v, int w) {
        adj[v].push_back(w);
        adj[w].push_back(v);
    }

    // Parallel BFS: the neighbours of each dequeued vertex are examined
    // in parallel; shared state is updated inside a critical section.
    void parallelBFS(int startVertex) {
        vector<bool> visited(V, false);
        queue<int> q;
        visited[startVertex] = true;
        q.push(startVertex);
        while (!q.empty()) {
            int v = q.front();
            q.pop();
            cout << v << " ";
            #pragma omp parallel for
            for (int i = 0; i < (int)adj[v].size(); i++) {
                int u = adj[v][i];
                #pragma omp critical
                if (!visited[u]) {
                    visited[u] = true;
                    q.push(u);
                }
            }
        }
    }

    // Parallel DFS: the children of each visited vertex are explored in parallel
    void parallelDFS(int startVertex) {
        vector<bool> visited(V, false);
        visited[startVertex] = true;
        dfsUtil(startVertex, visited);
    }

private:
    void dfsUtil(int v, vector<bool>& visited) {
        cout << v << " ";
        #pragma omp parallel for
        for (int i = 0; i < (int)adj[v].size(); i++) {
            int u = adj[v][i];
            bool explore = false;
            // Claim the vertex inside a critical section, recurse outside it
            #pragma omp critical
            if (!visited[u]) {
                visited[u] = true;
                explore = true;
            }
            if (explore) dfsUtil(u, visited);
        }
    }
};

int main() {
    // Create an undirected tree with 7 vertices
    Graph g(7);
    g.addEdge(0, 1);
    g.addEdge(0, 2);
    g.addEdge(1, 3);
    g.addEdge(1, 4);
    g.addEdge(2, 5);
    g.addEdge(2, 6);
    /*
              0
            /   \
           1     2
          / \   / \
         3   4 5   6
    */
    cout << "Depth-First Search (DFS): ";
    g.parallelDFS(0);
    cout << endl;
    cout << "Breadth-First Search (BFS): ";
    g.parallelBFS(0);
    cout << endl;
    return 0;
}
Output:
Depth-First Search (DFS): 0 1 3 4 2 5 6
Breadth-First Search (BFS): 0 1 2 3 4 5 6
Assignment No. 2:
Write a program to implement Parallel Bubble Sort and Merge sort using OpenMP. Use
existing algorithms and measure the performance of sequential and parallel algorithms.
Code:
#include <iostream>
#include <cstdlib>
#include <algorithm>
#include <omp.h>
using namespace std;

// Sequential bubble sort
void bubble(int* arr, int n) {
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - i - 1; j++)
            if (arr[j] > arr[j + 1]) swap(arr[j], arr[j + 1]);
}

// Parallel bubble sort (odd-even transposition): alternate passes over
// odd-indexed and even-indexed pairs; pairs within a pass are independent.
void pBubble(int* arr, int n) {
    for (int i = 0; i < n; i++) {
        // Odd-indexed pairs
        #pragma omp parallel for
        for (int j = 1; j < n - 1; j += 2)
            if (arr[j] > arr[j + 1]) swap(arr[j], arr[j + 1]);
        // Synchronize
        #pragma omp barrier
        // Even-indexed pairs
        #pragma omp parallel for
        for (int j = 0; j < n - 1; j += 2)
            if (arr[j] > arr[j + 1]) swap(arr[j], arr[j + 1]);
    }
}

void printArray(int* arr, int n) {
    for (int i = 0; i < n; i++) cout << arr[i] << " ";
    cout << "\n";
}

int main() {
    // Set up variables
    int n = 10;
    int arr[n];
    int brr[n];
    double start_time, end_time;
    // Fill arr and keep an identical copy in brr for the parallel run
    for (int i = 0; i < n; i++) arr[i] = brr[i] = rand() % 100;
    // Sequential time
    start_time = omp_get_wtime();
    bubble(arr, n);
    end_time = omp_get_wtime();
    cout << "Sequential Bubble Sort took : " << end_time - start_time << " seconds.\n";
    printArray(arr, n);
    // Parallel time
    start_time = omp_get_wtime();
    pBubble(brr, n);
    end_time = omp_get_wtime();
    cout << "Parallel Bubble Sort took : " << end_time - start_time << " seconds.\n";
    printArray(brr, n);
}
Output:
Sequential Bubble Sort took : 0.00957767 seconds.
Parallel Bubble Sort took : 0.00988083 seconds.
int main() {
    int n = 1000;
    int arr[n];
    double start_time, end_time;
    // ... fill arr, then time the sequential and parallel merge sorts
    // with omp_get_wtime() as in the bubble sort program above ...
    return 0;
}
Output:
Time taken by sequential algorithm: 0.000135859 seconds
Time taken by parallel algorithm: 0.000123855 seconds
Assignment No. 3:
Implement Min, Max, Sum and Average operations using Parallel Reduction.
Code:
/*
MSVC on Windows implements only OpenMP 2.0, which lacks the min/max
reduction clauses used below, so this program may not build with MSVC++.
Please use a Linux environment. [You can try "Windows Subsystem for Linux".]
*/
#include <iostream>
#include <omp.h>
using namespace std;

// Parallel minimum using an OpenMP min reduction (OpenMP 3.1+)
int minval(int arr[], int n) {
    int m = arr[0];
    #pragma omp parallel for reduction(min : m)
    for (int i = 0; i < n; i++)
        if (arr[i] < m) m = arr[i];
    return m;
}

// Parallel maximum using a max reduction
int maxval(int arr[], int n) {
    int m = arr[0];
    #pragma omp parallel for reduction(max : m)
    for (int i = 0; i < n; i++)
        if (arr[i] > m) m = arr[i];
    return m;
}

// Parallel sum using a + reduction
int sum(int arr[], int n) {
    int s = 0;
    #pragma omp parallel for reduction(+ : s)
    for (int i = 0; i < n; i++)
        s += arr[i];
    return s;
}

// Average = parallel sum divided by the element count
double average(int arr[], int n) {
    return (double)sum(arr, n) / n;
}

int main() {
    int n = 5;
    int arr[] = {1, 2, 3, 4, 5};
    cout << "The minimum value is: " << minval(arr, n) << '\n';
    cout << "The maximum value is: " << maxval(arr, n) << '\n';
    cout << "The summation is: " << sum(arr, n) << '\n';
    cout << "The average is: " << average(arr, n) << '\n';
    return 0;
}
Output:
The minimum value is: 1
The maximum value is: 5
The summation is: 15
The average is: 3
Assignment No. 4:
Write a CUDA Program for:
1. Addition of two large vectors
2. Matrix Multiplication using CUDA C
Code:
#include <iostream>
#include <cstdlib>
#include <cuda_runtime.h>
using namespace std;

// CUDA kernel: each thread adds one pair of elements
__global__ void add(int* A, int* B, int* C, int size) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < size) C[tid] = A[tid] + B[tid];
}

void initialize(int* v, int size) {
    for (int i = 0; i < size; i++) v[i] = rand() % 10;
}

void print(int* v, int size) {
    for (int i = 0; i < size; i++) cout << v[i] << " ";
    cout << "\n";
}

int main() {
    int N = 4;
    int *A, *B, *C;
    int vectorSize = N;
    size_t vectorBytes = vectorSize * sizeof(int);
    A = new int[vectorSize];
    B = new int[vectorSize];
    C = new int[vectorSize];
    initialize(A, vectorSize);
    initialize(B, vectorSize);
    cout << "Vector A: ";
    print(A, N);
    cout << "Vector B: ";
    print(B, N);
    // Allocate device memory
    int *X, *Y, *Z;
    cudaMalloc(&X, vectorBytes);
    cudaMalloc(&Y, vectorBytes);
    cudaMalloc(&Z, vectorBytes);
    // Copy inputs to the device
    cudaMemcpy(X, A, vectorBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(Y, B, vectorBytes, cudaMemcpyHostToDevice);
    // Launch kernel: one thread per element
    int threadsPerBlock = 256;
    int blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock;
    add<<<blocksPerGrid, threadsPerBlock>>>(X, Y, Z, N);
    // Copy the result back and print it
    cudaMemcpy(C, Z, vectorBytes, cudaMemcpyDeviceToHost);
    cout << "Addition: ";
    print(C, N);
    delete[] A;
    delete[] B;
    delete[] C;
    cudaFree(X);
    cudaFree(Y);
    cudaFree(Z);
    return 0;
}
Output:
Vector A: 3 6 7 5
Vector B: 3 5 6 2
Addition: 6 11 13 7
#include <iostream>
#include <cstdlib>
#include <cuda_runtime.h>
using namespace std;

// CUDA kernel: each thread computes one element of C = A * B
__global__ void multiply(int* A, int* B, int* C, int size) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < size && col < size) {
        int sum = 0;
        for (int k = 0; k < size; k++)
            sum += A[row * size + k] * B[k * size + col];
        C[row * size + col] = sum;
    }
}

void initialize(int* m, int size) {
    for (int i = 0; i < size * size; i++) m[i] = rand() % 10;
}

void print(int* m, int size) {
    for (int row = 0; row < size; row++) {
        for (int col = 0; col < size; col++) cout << m[row * size + col] << " ";
        cout << "\n";
    }
}

int main() {
    int *A, *B, *C;
    int N = 2;
    int blockSize = 16;
    int matrixSize = N * N;
    size_t matrixBytes = matrixSize * sizeof(int);
    A = new int[matrixSize];
    B = new int[matrixSize];
    C = new int[matrixSize];
    initialize(A, N);
    initialize(B, N);
    cout << "Matrix A: \n";
    print(A, N);
    cout << "Matrix B: \n";
    print(B, N);
    int *X, *Y, *Z;
    // Allocate space
    cudaMalloc(&X, matrixBytes);
    cudaMalloc(&Y, matrixBytes);
    cudaMalloc(&Z, matrixBytes);
    // Copy the matrices to the device
    cudaMemcpy(X, A, matrixBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(Y, B, matrixBytes, cudaMemcpyHostToDevice);
    // Launch kernel with a 2D grid of 2D blocks
    dim3 threads(blockSize, blockSize);
    dim3 blocks((N + blockSize - 1) / blockSize, (N + blockSize - 1) / blockSize);
    multiply<<<blocks, threads>>>(X, Y, Z, N);
    // Copy the result back and print it
    cudaMemcpy(C, Z, matrixBytes, cudaMemcpyDeviceToHost);
    cout << "Multiplication of matrix A and B: \n";
    print(C, N);
    delete[] A;
    delete[] B;
    delete[] C;
    cudaFree(X);
    cudaFree(Y);
    cudaFree(Z);
    return 0;
}
Output:
Matrix A:
3 6
7 5
Matrix B:
3 5
6 2